Skip to main content

Showing 1–50 of 362 results for author: Qi, Y

  1. arXiv:2407.12999  [pdf, other

    cs.CY cs.AI cs.CR

    Securing the Future of GenAI: Policy and Technology

    Authors: Mihai Christodorescu, Ryan Craven, Soheil Feizi, Neil Gong, Mia Hoffmann, Somesh Jha, Zhengyuan Jiang, Mehrdad Saberi Kamarposhti, John Mitchell, Jessica Newman, Emelia Probasco, Yanjun Qi, Khawaja Shams, Matthew Turek

    Abstract: The rise of Generative AI (GenAI) brings about transformative potential across sectors, but its dual-use nature also amplifies risks. Governments globally are grappling with the challenge of regulating GenAI, balancing innovation against safety. China, the United States (US), and the European Union (EU) are at the forefront with initiatives like the Management of Algorithmic Recommendations, the E… ▽ More

    Submitted 21 May, 2024; originally announced July 2024.

  2. arXiv:2407.12532  [pdf, other

    cs.CL cs.AI

    Towards Collaborative Intelligence: Propagating Intentions and Reasoning for Multi-Agent Coordination with Large Language Models

    Authors: Xihe Qiu, Haoyu Wang, Xiaoyu Tan, Chao Qu, Yujie Xiong, Yuan Cheng, Yinghui Xu, Wei Chu, Yuan Qi

    Abstract: Effective collaboration in multi-agent systems requires communicating goals and intentions between agents. Current agent frameworks often suffer from dependencies on single-agent execution and lack robust inter-module communication, frequently leading to suboptimal multi-agent reinforcement learning (MARL) policies and inadequate task coordination. To address these challenges, we present a framewo… ▽ More

    Submitted 17 July, 2024; originally announced July 2024.

  3. arXiv:2407.12522  [pdf, other

    cs.CL cs.AI

    Struct-X: Enhancing Large Language Models Reasoning with Structured Data

    Authors: Xiaoyu Tan, Haoyu Wang, Xihe Qiu, Yuan Cheng, Yinghui Xu, Wei Chu, Yuan Qi

    Abstract: Structured data, rich in logical and relational information, has the potential to enhance the reasoning abilities of large language models (LLMs). Still, its integration poses a challenge due to the risk of overwhelming LLMs with excessive tokens and irrelevant context information. To address this, we propose Struct-X, a novel framework that operates through five key phases: ``read-model-fill-refl… ▽ More

    Submitted 17 July, 2024; originally announced July 2024.

  4. arXiv:2407.12217  [pdf, other

    cs.CV

    AFIDAF: Alternating Fourier and Image Domain Adaptive Filters as an Efficient Alternative to Attention in ViTs

    Authors: Yunling Zheng, Zeyi Xu, Fanghui Xue, Biao Yang, Jiancheng Lyu, Shuai Zhang, Yingyong Qi, Jack Xin

    Abstract: We propose and demonstrate an alternating Fourier and image domain filtering approach for feature extraction as an efficient alternative to build a vision backbone without using the computationally intensive attention. The performance among the lightweight models reaches the state-of-the-art level on ImageNet-1K classification, and improves downstream tasks on object detection and segmentation con… ▽ More

    Submitted 16 July, 2024; originally announced July 2024.

  5. arXiv:2407.11700  [pdf, other

    cs.CV eess.IV

    Rate-Distortion-Cognition Controllable Versatile Neural Image Compression

    Authors: Jinming Liu, Ruoyu Feng, Yunpeng Qi, Qiuyu Chen, Zhibo Chen, Wenjun Zeng, Xin Jin

    Abstract: Recently, the field of Image Coding for Machines (ICM) has garnered heightened interest and significant advances thanks to the rapid progress of learning-based techniques for image compression and analysis. Previous studies often require training separate codecs to support various bitrate levels, machine tasks, and networks, thus lacking both flexibility and practicality. To address these challeng… ▽ More

    Submitted 17 July, 2024; v1 submitted 16 July, 2024; originally announced July 2024.

    Comments: ECCV2024

  6. arXiv:2407.11298  [pdf, other

    cs.RO

    ThinkGrasp: A Vision-Language System for Strategic Part Grasping in Clutter

    Authors: Yaoyao Qian, Xupeng Zhu, Ondrej Biza, Shuo Jiang, Linfeng Zhao, Haojie Huang, Yu Qi, Robert Platt

    Abstract: Robotic grasping in cluttered environments remains a significant challenge due to occlusions and complex object arrangements. We have developed ThinkGrasp, a plug-and-play vision-language grasping system that makes use of GPT-4o's advanced contextual reasoning for heavy clutter environment grasping strategies. ThinkGrasp can effectively identify and generate grasp poses for target objects, even wh… ▽ More

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Project Website:(https://h-freax.github.io/thinkgrasp_page/)

  7. arXiv:2407.08500  [pdf, other

    cs.LG cs.AI

    Latent Conditional Diffusion-based Data Augmentation for Continuous-Time Dynamic Graph Mode

    Authors: Yuxing Tian, Yiyan Qi, Aiwen Jiang, Qi Huang, Jian Guo

    Abstract: Continuous-Time Dynamic Graph (CTDG) precisely models evolving real-world relationships, drawing heightened interest in dynamic graph learning across academia and industry. However, existing CTDG models encounter challenges stemming from noise and limited historical data. Graph Data Augmentation (GDA) emerges as a critical solution, yet current approaches primarily focus on static graphs and strug… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted by KDD 2024

  8. arXiv:2407.08167  [pdf, other

    eess.IV cs.CV

    DSCENet: Dynamic Screening and Clinical-Enhanced Multimodal Fusion for MPNs Subtype Classification

    Authors: Yuan Zhang, Yaolei Qi, Xiaoming Qi, Yongyue Wei, Guanyu Yang

    Abstract: The precise subtype classification of myeloproliferative neoplasms (MPNs) based on multimodal information, which assists clinicians in diagnosis and long-term treatment plans, is of great clinical significance. However, it remains a great challenging task due to the lack of diagnostic representativeness for local patches and the absence of diagnostic-relevant features from a single modality. In th… ▽ More

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted by MICCAI2024

  9. arXiv:2407.05915  [pdf, other

    cs.NI cs.AI

    Harnessing Federated Generative Learning for Green and Sustainable Internet of Things

    Authors: Yuanhang Qi, M. Shamim Hossain

    Abstract: The rapid proliferation of devices in the Internet of Things (IoT) has ushered in a transformative era of data-driven connectivity across various domains. However, this exponential growth has raised pressing concerns about environmental sustainability and data privacy. In response to these challenges, this paper introduces One-shot Federated Learning (OSFL), an innovative paradigm that harmonizes… ▽ More

    Submitted 30 April, 2024; originally announced July 2024.

    Comments: This paper is a correction of the published version, in which we corrected the grammatical errors between contexts and highlighted the relationship with "Federated generative learning with foundation models"

  10. arXiv:2407.05005  [pdf, other

    cs.LG cs.DC

    Personalized Federated Domain-Incremental Learning based on Adaptive Knowledge Matching

    Authors: Yichen Li, Wenchao Xu, Haozhao Wang, Ruixuan Li, Yining Qi, Jingcai Guo

    Abstract: This paper focuses on Federated Domain-Incremental Learning (FDIL) where each client continues to learn incremental tasks where their domain shifts from each other. We propose a novel adaptive knowledge matching-based personalized FDIL approach (pFedDIL) which allows each client to alternatively utilize appropriate incremental task learning strategy on the correlation with the knowledge from previ… ▽ More

    Submitted 18 July, 2024; v1 submitted 6 July, 2024; originally announced July 2024.

  11. arXiv:2407.04020  [pdf, other

    cs.CL

    LLMAEL: Large Language Models are Good Context Augmenters for Entity Linking

    Authors: Amy Xin, Yunjia Qi, Zijun Yao, Fangwei Zhu, Kaisheng Zeng, Xu Bin, Lei Hou, Juanzi Li

    Abstract: Entity Linking (EL) models are well-trained at mapping mentions to their corresponding entities according to a given context. However, EL models struggle to disambiguate long-tail entities due to their limited training data. Meanwhile, large language models (LLMs) are more robust at interpreting uncommon mentions. Yet, due to a lack of specialized training, LLMs suffer at generating correct entity… ▽ More

    Submitted 15 July, 2024; v1 submitted 4 July, 2024; originally announced July 2024.

  12. arXiv:2407.03900  [pdf, other

    cs.CV

    Oracle Bone Inscriptions Multi-modal Dataset

    Authors: Bang Li, Donghao Luo, Yujie Liang, Jing Yang, Zengmao Ding, Xu Peng, Boyuan Jiang, Shengwei Han, Dan Sui, Peichao Qin, Pian Wu, Chaoyang Wang, Yun Qi, Taisong Jin, Chengjie Wang, Xiaoming Huang, Zhan Shu, Rongrong Ji, Yongge Liu, Yunsheng Wu

    Abstract: Oracle bone inscriptions(OBI) is the earliest developed writing system in China, bearing invaluable written exemplifications of early Shang history and paleography. However, the task of deciphering OBI, in the current climate of the scholarship, can prove extremely challenging. Out of the 4,500 oracle bone characters excavated, only a third have been successfully identified. Therefore, leveraging… ▽ More

    Submitted 4 July, 2024; originally announced July 2024.

  13. arXiv:2407.02073  [pdf, other

    cs.LG

    Contribution Evaluation of Heterogeneous Participants in Federated Learning via Prototypical Representations

    Authors: Qi Guo, Minghao Yao, Zhen Tian, Saiyu Qi, Yong Qi, Yun Lin, Jin Song Dong

    Abstract: Contribution evaluation in federated learning (FL) has become a pivotal research area due to its applicability across various domains, such as detecting low-quality datasets, enhancing model robustness, and designing incentive mechanisms. Existing contribution evaluation methods, which primarily rely on data volume, model similarity, and auxiliary test datasets, have shown success in diverse scena… ▽ More

    Submitted 2 July, 2024; originally announced July 2024.

  14. arXiv:2407.00921  [pdf, other

    cs.CV

    PointViG: A Lightweight GNN-based Model for Efficient Point Cloud Analysis

    Authors: Qiang Zheng, Yafei Qi, Chen Wang, Chao Zhang, Jian Sun

    Abstract: In the domain of point cloud analysis, despite the significant capabilities of Graph Neural Networks (GNNs) in managing complex 3D datasets, existing approaches encounter challenges like high computational costs and scalability issues with extensive scenarios. These limitations restrict the practical deployment of GNNs, notably in resource-constrained environments. To address these issues, this st… ▽ More

    Submitted 30 June, 2024; originally announced July 2024.

  15. arXiv:2407.00365  [pdf, other

    cs.CL

    Financial Knowledge Large Language Model

    Authors: Cehao Yang, Chengjin Xu, Yiyan Qi

    Abstract: Artificial intelligence is making significant strides in the finance industry, revolutionizing how data is processed and interpreted. Among these technologies, large language models (LLMs) have demonstrated substantial potential to transform financial services by automating complex tasks, enhancing customer service, and providing detailed financial analysis. Firstly, we introduce IDEA-FinBench, an… ▽ More

    Submitted 29 June, 2024; originally announced July 2024.

    Comments: 66 pages

  16. arXiv:2406.19143  [pdf, other

    cs.DB cs.DS

    QSketch: An Efficient Sketch for Weighted Cardinality Estimation in Streams

    Authors: Yiyan Qi, Rundong Li, Pinghui Wang, Yufang Sun, Rui Xing

    Abstract: Estimating cardinality, i.e., the number of distinct elements, of a data stream is a fundamental problem in areas like databases, computer networks, and information retrieval. This study delves into a broader scenario where each element carries a positive weight. Unlike traditional cardinality estimation, limited research exists on weighted cardinality, with current methods requiring substantial m… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: 12 pages, 10 figures, accepted by KDD 2024

  17. arXiv:2406.18967  [pdf, other

    cs.CV

    Structural Attention: Rethinking Transformer for Unpaired Medical Image Synthesis

    Authors: Vu Minh Hieu Phan, Yutong Xie, Bowen Zhang, Yuankai Qi, Zhibin Liao, Antonios Perperidis, Son Lam Phung, Johan W. Verjans, Minh-Son To

    Abstract: Unpaired medical image synthesis aims to provide complementary information for an accurate clinical diagnostics, and address challenges in obtaining aligned multi-modal medical scans. Transformer-based models excel in imaging translation tasks thanks to their ability to capture long-range dependencies. Although effective in supervised training settings, their performance falters in unpaired image… ▽ More

    Submitted 27 June, 2024; originally announced June 2024.

    Comments: MICCAI2024 - Early Accept Top 11%

  18. arXiv:2406.18327  [pdf, other

    eess.IV cs.CV cs.LG

    Multi-modal Evidential Fusion Network for Trusted PET/CT Tumor Segmentation

    Authors: Yuxuan Qi, Li Lin, Jiajun Wang, Jingya Zhang, Bin Zhang

    Abstract: Accurate segmentation of tumors in PET/CT images is important in computer-aided diagnosis and treatment of cancer. The key issue of such a segmentation problem lies in the effective integration of complementary information from PET and CT images. However, the quality of PET and CT images varies widely in clinical settings, which leads to uncertainty in the modality information extracted by network… ▽ More

    Submitted 26 June, 2024; originally announced June 2024.

  19. arXiv:2406.15658  [pdf, other

    cs.CV cs.AI

    TorchSpatial: A Location Encoding Framework and Benchmark for Spatial Representation Learning

    Authors: Nemin Wu, Qian Cao, Zhangyu Wang, Zeping Liu, Yanlin Qi, Jielu Zhang, Joshua Ni, Xiaobai Yao, Hongxu Ma, Lan Mu, Stefano Ermon, Tanuja Ganu, Akshay Nambi, Ni Lao, Gengchen Mai

    Abstract: Spatial representation learning (SRL) aims at learning general-purpose neural network representations from various types of spatial data (e.g., points, polylines, polygons, networks, images, etc.) in their native formats. Learning good spatial representations is a fundamental problem for various downstream applications such as species distribution modeling, weather forecasting, trajectory generati… ▽ More

    Submitted 21 June, 2024; originally announced June 2024.

    Comments: 9 pages, 2 figures. Submitted to NeurIPS 2024 Datasets and Benchmarks Track. Under review

  20. arXiv:2406.14360  [pdf, other

    cs.CV

    Deblurring Neural Radiance Fields with Event-driven Bundle Adjustment

    Authors: Yunshan Qi, Lin Zhu, Yifan Zhao, Nan Bao, Jia Li

    Abstract: Neural Radiance Fields (NeRF) achieve impressive 3D representation learning and novel view synthesis results with high-quality multi-view images as input. However, motion blur in images often occurs in low-light and high-speed motion scenes, which significantly degrade the reconstruction quality of NeRF. Previous deblurring NeRF methods are struggling to estimate information during the exposure ti… ▽ More

    Submitted 20 June, 2024; originally announced June 2024.

  21. arXiv:2406.11160  [pdf, other

    cs.AI

    Context Graph

    Authors: Chengjin Xu, Muzhi Li, Cehao Yang, Xuhui Jiang, Lumingyuan Tang, Yiyan Qi, Jian Guo

    Abstract: Knowledge Graphs (KGs) are foundational structures in many AI applications, representing entities and their interrelations through triples. However, triple-based KGs lack the contextual information of relational knowledge, like temporal dynamics and provenance details, which are crucial for comprehensive knowledge representation and effective reasoning. Instead, \textbf{Context Graphs} (CGs) expan… ▽ More

    Submitted 27 June, 2024; v1 submitted 16 June, 2024; originally announced June 2024.

  22. arXiv:2406.02622  [pdf, other

    cs.CR cs.AI

    Safeguarding Large Language Models: A Survey

    Authors: Yi Dong, Ronghui Mu, Yanghao Zhang, Siqi Sun, Tianle Zhang, Changshun Wu, Gaojie Jin, Yi Qi, Jinwei Hu, Jie Meng, Saddek Bensalem, Xiaowei Huang

    Abstract: In the burgeoning field of Large Language Models (LLMs), developing a robust safety mechanism, colloquially known as "safeguards" or "guardrails", has become imperative to ensure the ethical use of LLMs within prescribed boundaries. This article provides a systematic literature review on the current status of this critical mechanism. It discusses its major challenges and how it can be enhanced int… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: under review. arXiv admin note: text overlap with arXiv:2402.01822

  23. arXiv:2406.02407  [pdf, other

    cs.CV

    WE-GS: An In-the-wild Efficient 3D Gaussian Representation for Unconstrained Photo Collections

    Authors: Yuze Wang, Junyi Wang, Yue Qi

    Abstract: Novel View Synthesis (NVS) from unconstrained photo collections is challenging in computer graphics. Recently, 3D Gaussian Splatting (3DGS) has shown promise for photorealistic and real-time NVS of static scenes. Building on 3DGS, we propose an efficient point-based differentiable rendering framework for scene reconstruction from photo collections. Our key innovation is a residual-based spherical… ▽ More

    Submitted 4 June, 2024; originally announced June 2024.

    Comments: Our project page is available at https://yuzewang1998.github.io/we-gs.github.io/

  24. arXiv:2406.01256  [pdf, other

    cs.CV cs.AI

    Augmented Commonsense Knowledge for Remote Object Grounding

    Authors: Bahram Mohammadi, Yicong Hong, Yuankai Qi, Qi Wu, Shirui Pan, Javen Qinfeng Shi

    Abstract: The vision-and-language navigation (VLN) task necessitates an agent to perceive the surroundings, follow natural language instructions, and act in photo-realistic unseen environments. Most of the existing methods employ the entire image or object features to represent navigable viewpoints. However, these representations are insufficient for proper action prediction, especially for the REVERIE task… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

  25. arXiv:2406.00993  [pdf

    eess.SP cs.HC q-bio.OT

    Detection of Acetone as a Gas Biomarker for Diabetes Based on Gas Sensor Technology

    Authors: Jiaming Wei, Tong Liu, Jipeng Huang, Xiaowei Li, Yurui Qi, Gangyin Luo

    Abstract: With the continuous development and improvement of medical services, there is a growing demand for improving diabetes diagnosis. Exhaled breath analysis, characterized by its speed, convenience, and non-invasive nature, is leading the trend in diagnostic development. Studies have shown that the acetone levels in the breath of diabetes patients are higher than normal, making acetone a basis for dia… ▽ More

    Submitted 3 June, 2024; originally announced June 2024.

    Comments: 9 pages, 14 figures

  26. arXiv:2405.20652  [pdf, other

    cs.LG

    Sign is Not a Remedy: Multiset-to-Multiset Message Passing for Learning on Heterophilic Graphs

    Authors: Langzhang Liang, Sunwoo Kim, Kijung Shin, Zenglin Xu, Shirui Pan, Yuan Qi

    Abstract: Graph Neural Networks (GNNs) have gained significant attention as a powerful modeling and inference method, especially for homophilic graph-structured data. To empower GNNs in heterophilic graphs, where adjacent nodes exhibit dissimilar labels or features, Signed Message Passing (SMP) has been widely adopted. However, there is a lack of theoretical and empirical analysis regarding the limitations… ▽ More

    Submitted 31 May, 2024; originally announced May 2024.

    Comments: Published as a conference paper at ICML 2024

  27. arXiv:2405.17846  [pdf, other

    cs.RO cs.AI

    Safety Control of Service Robots with LLMs and Embodied Knowledge Graphs

    Authors: Yong Qi, Gabriel Kyebambo, Siyuan Xie, Wei Shen, Shenghui Wang, Bitao Xie, Bin He, Zhipeng Wang, Shuo Jiang

    Abstract: Safety limitations in service robotics across various industries have raised significant concerns about the need for robust mechanisms ensuring that robots adhere to safe practices, thereby preventing actions that might harm humans or cause property damage. Despite advances, including the integration of Knowledge Graphs (KGs) with Large Language Models (LLMs), challenges in ensuring consistent saf… ▽ More

    Submitted 28 May, 2024; originally announced May 2024.

  28. arXiv:2405.14808  [pdf, other

    cs.CL cs.AI cs.CY cs.HC cs.LG

    Implicit Personalization in Language Models: A Systematic Study

    Authors: Zhijing Jin, Nils Heil, Jiarui Liu, Shehzaad Dhuliawala, Yahang Qi, Bernhard Schölkopf, Rada Mihalcea, Mrinmaya Sachan

    Abstract: Implicit Personalization (IP) is a phenomenon of language models inferring a user's background from the implicit cues in the input prompts and tailoring the response based on this inference. While previous work has touched upon various instances of this problem, there lacks a unified framework to study this behavior. This work systematically studies IP through a rigorous mathematical formulation,… ▽ More

    Submitted 23 May, 2024; originally announced May 2024.

  29. arXiv:2405.10746  [pdf, other

    cs.CY cs.AI

    Causality in the Can: Diet Coke's Impact on Fatness

    Authors: Yicheng Qi, Ang Li

    Abstract: Artificially sweetened beverages like Diet Coke are often considered healthier alternatives, but the debate over their impact on obesity persists. Previous research has predominantly relied on observational data or randomized controlled trials (RCTs), which may not accurately capture the causal relationship between Diet Coke consumption and obesity. This study uses causal inference methods, employ… ▽ More

    Submitted 17 May, 2024; originally announced May 2024.

    Comments: 9 pages, 4 figures, uses aaai24.sty

  30. arXiv:2405.07046  [pdf, other

    cs.CV

    Retrieval Enhanced Zero-Shot Video Captioning

    Authors: Yunchuan Ma, Laiyun Qing, Guorong Li, Yuankai Qi, Quan Z. Sheng, Qingming Huang

    Abstract: Despite the significant progress of fully-supervised video captioning, zero-shot methods remain much less explored. In this paper, we propose to take advantage of existing pre-trained large-scale vision and language models to directly generate captions with test time adaptation. Specifically, we bridge video and text using three key models: a general video understanding model XCLIP, a general imag… ▽ More

    Submitted 11 May, 2024; originally announced May 2024.

  31. arXiv:2405.05925  [pdf, other

    cs.LG cs.AI physics.ao-ph

    FuXi-ENS: A machine learning model for medium-range ensemble weather forecasting

    Authors: Xiaohui Zhong, Lei Chen, Hao Li, Jun Liu, Xu Fan, Jie Feng, Kan Dai, Jing-Jia Luo, Jie Wu, Yuan Qi, Bo Lu

    Abstract: Ensemble forecasting is crucial for improving weather predictions, especially for forecasts of extreme events. Constructing an ensemble prediction system (EPS) based on conventional NWP models is highly computationally expensive. ML models have emerged as valuable tools for deterministic weather forecasts, providing forecasts with significantly reduced computational requirements and even surpassin… ▽ More

    Submitted 5 July, 2024; v1 submitted 9 May, 2024; originally announced May 2024.

  32. arXiv:2405.05008  [pdf, other

    cs.CL

    ADELIE: Aligning Large Language Models on Information Extraction

    Authors: Yunjia Qi, Hao Peng, Xiaozhi Wang, Bin Xu, Lei Hou, Juanzi Li

    Abstract: Large language models (LLMs) usually fall short on information extraction (IE) tasks and struggle to follow the complex instructions of IE tasks. This primarily arises from LLMs not being aligned with humans, as mainstream alignment datasets typically do not include IE data. In this paper, we introduce ADELIE (Aligning large language moDELs on Information Extraction), an aligned LLM that effective… ▽ More

    Submitted 8 May, 2024; originally announced May 2024.

  33. arXiv:2405.00699  [pdf, other

    cs.NE cs.AI cs.LG

    Direct Training Needs Regularisation: Anytime Optimal Inference Spiking Neural Network

    Authors: Dengyu Wu, Yi Qi, Kaiwen Cai, Gaojie Jin, Xinping Yi, Xiaowei Huang

    Abstract: Spiking Neural Network (SNN) is acknowledged as the next generation of Artificial Neural Network (ANN) and hold great promise in effectively processing spatial-temporal information. However, the choice of timestep becomes crucial as it significantly impacts the accuracy of the neural network training. Specifically, a smaller timestep indicates better performance in efficient computing, resulting i… ▽ More

    Submitted 15 April, 2024; originally announced May 2024.

  34. arXiv:2404.13388  [pdf

    eess.IV cs.CV cs.LG

    Diagnosis of Multiple Fundus Disorders Amidst a Scarcity of Medical Experts Via Self-supervised Machine Learning

    Authors: Yong Liu, Mengtian Kang, Shuo Gao, Chi Zhang, Ying Liu, Shiming Li, Yue Qi, Arokia Nathan, Wenjun Xu, Chenyu Tang, Edoardo Occhipinti, Mayinuer Yusufu, Ningli Wang, Weiling Bai, Luigi Occhipinti

    Abstract: Fundus diseases are major causes of visual impairment and blindness worldwide, especially in underdeveloped regions, where the shortage of ophthalmologists hinders timely diagnosis. AI-assisted fundus image analysis has several advantages, such as high accuracy, reduced workload, and improved accessibility, but it requires a large amount of expert-annotated data to build reliable models. To addres… ▽ More

    Submitted 23 April, 2024; v1 submitted 20 April, 2024; originally announced April 2024.

  35. arXiv:2404.13386  [pdf

    eess.IV cs.CV cs.LG

    SSVT: Self-Supervised Vision Transformer For Eye Disease Diagnosis Based On Fundus Images

    Authors: Jiaqi Wang, Mengtian Kang, Yong Liu, Chi Zhang, Ying Liu, Shiming Li, Yue Qi, Wenjun Xu, Chenyu Tang, Edoardo Occhipinti, Mayinuer Yusufu, Ningli Wang, Weiling Bai, Shuo Gao, Luigi G. Occhipinti

    Abstract: Machine learning-based fundus image diagnosis technologies trigger worldwide interest owing to their benefits such as reducing medical resource power and providing objective evaluation results. However, current methods are commonly based on supervised methods, bringing in a heavy workload to biomedical staff and hence suffering in expanding effective databases. To address this issue, in this artic… ▽ More

    Submitted 20 April, 2024; originally announced April 2024.

    Comments: ISBI 2024

  36. arXiv:2404.10599  [pdf, other

    cs.NE

    Towards free-response paradigm: a theory on decision-making in spiking neural networks

    Authors: Zhichao Zhu, Yang Qi, Wenlian Lu, Zhigang Wang, Lu Cao, Jianfeng Feng

    Abstract: The energy-efficient and brain-like information processing abilities of Spiking Neural Networks (SNNs) have attracted considerable attention, establishing them as a crucial element of brain-inspired computing. One prevalent challenge encountered by SNNs is the trade-off between inference speed and accuracy, which requires sufficient time to achieve the desired level of performance. Drawing inspira… ▽ More

    Submitted 16 April, 2024; originally announced April 2024.

    Comments: 27 pages, 6 figures, 3 tables

  37. arXiv:2404.09544  [pdf, other

    cs.LG cs.AI

    GNNavigator: Towards Adaptive Training of Graph Neural Networks via Automatic Guideline Exploration

    Authors: Tong Qiao, Jianlei Yang, Yingjie Qi, Ao Zhou, Chen Bai, Bei Yu, Weisheng Zhao, Chunming Hu

    Abstract: Graph Neural Networks (GNNs) succeed significantly in many applications recently. However, balancing GNNs training runtime cost, memory consumption, and attainable accuracy for various applications is non-trivial. Previous training methodologies suffer from inferior adaptability and lack a unified training optimization solution. To address the problem, this work proposes GNNavigator, an adaptive G… ▽ More

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted by DAC'24

  38. arXiv:2404.09497  [pdf, other

    cs.AR

    Towards Efficient SRAM-PIM Architecture Design by Exploiting Unstructured Bit-Level Sparsity

    Authors: Cenlin Duan, Jianlei Yang, Yiou Wang, Yikun Wang, Yingjie Qi, Xiaolin He, Bonan Yan, Xueyan Wang, Xiaotao Jia, Weisheng Zhao

    Abstract: Bit-level sparsity in neural network models harbors immense untapped potential. Eliminating redundant calculations of randomly distributed zero-bits significantly boosts computational efficiency. Yet, traditional digital SRAM-PIM architecture, limited by rigid crossbar architecture, struggles to effectively exploit this unstructured sparsity. To address this challenge, we propose Dyadic Block PIM… ▽ More

    Submitted 15 April, 2024; originally announced April 2024.

    Comments: Accepted by DAC'24

  39. arXiv:2404.08690  [pdf, other

    cs.CL cs.AI cs.CR cs.LG

    Towards Building a Robust Toxicity Predictor

    Authors: Dmitriy Bespalov, Sourav Bhabesh, Yi Xiang, Liutong Zhou, Yanjun Qi

    Abstract: Recent NLP literature pays little attention to the robustness of toxicity language predictors, while these systems are most likely to be used in adversarial contexts. This paper presents a novel adversarial attack, \texttt{ToxicTrap}, introducing small word-level perturbations to fool SOTA text classifiers to predict toxic text samples as benign. ToxicTrap exploits greedy based search strategies t… ▽ More

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: ACL 2023 /

  40. arXiv:2404.06579  [pdf, other

    cs.CL cs.AI cs.LG

    Less is More for Improving Automatic Evaluation of Factual Consistency

    Authors: Tong Wang, Ninad Kulkarni, Yanjun Qi

    Abstract: Assessing the factual consistency of automatically generated texts in relation to source context is crucial for developing reliable natural language generation applications. Recent literature proposes AlignScore which uses a unified alignment model to evaluate factual consistency and substantially outperforms previous methods across many benchmark tasks. In this paper, we take a closer look of dat… ▽ More

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: Accepted in NAACL24 Industry; 7 pages

  41. arXiv:2404.05605  [pdf, other

    cs.LG cs.AI

    Graph Neural Networks Automated Design and Deployment on Device-Edge Co-Inference Systems

    Authors: Ao Zhou, Jianlei Yang, Tong Qiao, Yingjie Qi, Zhi Yang, Weisheng Zhao, Chunming Hu

    Abstract: The key to device-edge co-inference paradigm is to partition models into computation-friendly and computation-intensive parts across the device and the edge, respectively. However, for Graph Neural Networks (GNNs), we find that simply partitioning without altering their structures can hardly achieve the full potential of the co-inference paradigm due to various computational-communication overhead… ▽ More

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: Accepted by DAC'24

  42. arXiv:2404.04902  [pdf, other

    cs.AI cs.SE

    AI2Apps: A Visual IDE for Building LLM-based AI Agent Applications

    Authors: Xin Pang, Zhucong Li, Jiaxiang Chen, Yuan Cheng, Yinghui Xu, Yuan Qi

    Abstract: We introduce AI2Apps, a Visual Integrated Development Environment (Visual IDE) with full-cycle capabilities that accelerates developers to build deployable LLM-based AI agent Applications. This Visual IDE prioritizes both the Integrity of its development tools and the Visuality of its components, ensuring a smooth and efficient building experience.On one hand, AI2Apps integrates a comprehensive de… ▽ More

    Submitted 7 April, 2024; originally announced April 2024.

  43. arXiv:2404.00849  [pdf, other

    cs.CV

    Generating Content for HDR Deghosting from Frequency View

    Authors: Tao Hu, Qingsen Yan, Yuankai Qi, Yanning Zhang

    Abstract: Recovering ghost-free High Dynamic Range (HDR) images from multiple Low Dynamic Range (LDR) images becomes challenging when the LDR images exhibit saturation and significant motion. Recent Diffusion Models (DMs) have been introduced in HDR imaging field, demonstrating promising performance, particularly in achieving visually perceptible results compared to previous DNN-based methods. However, DMs… ▽ More

    Submitted 31 March, 2024; originally announced April 2024.

    Comments: This paper is accepted by CVPR2024

  44. arXiv:2404.00611  [pdf, ps, other

    cs.CV

    Object-level Copy-Move Forgery Image Detection based on Inconsistency Mining

    Authors: Jingyu Wang, Niantai Jing, Ziyao Liu, Jie Nie, Yuxin Qi, Chi-Hung Chi, Kwok-Yan Lam

    Abstract: In copy-move tampering operations, perpetrators often employ techniques, such as blurring, to conceal tampering traces, posing significant challenges to the detection of object-level targets with intact structures. Focus on these challenges, this paper proposes an Object-level Copy-Move Forgery Image Detection based on Inconsistency Mining (IMNet). To obtain complete object-level targets, we custo… ▽ More

    Submitted 3 April, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

    Comments: 4 pages, 2 figures, Accepted to WWW 2024

  45. arXiv:2403.19531  [pdf, other

    cs.CR cs.DB cs.SI

    SecGraph: Towards SGX-based Efficient and Confidentiality-Preserving Graph Search

    Authors: Qiuhao Wang, Xu Yang, Saiyu Qi, Yong Qi

    Abstract: Graphs have more expressive power and are widely researched in various search demand scenarios, compared with traditional relational and XML models. Today, many graph search services have been deployed on a third-party server, which can alleviate users from the burdens of maintaining large-scale graphs and huge computation costs. Nevertheless, outsourcing graph search services to the third-party s… ▽ More

    Submitted 28 March, 2024; originally announced March 2024.

    Comments: This paper has been accepted by DASFAA 2024

  46. arXiv:2403.18100  [pdf

    cs.AI cs.LG

    Driving Intelligent IoT Monitoring and Control through Cloud Computing and Machine Learning

    Authors: Hanzhe Li, Xiangxiang Wang, Yuan Feng, Yaqian Qi, Jingxiao Tian

    Abstract: This article explores how to drive intelligent iot monitoring and control through cloud computing and machine learning. As iot and the cloud continue to generate large and diverse amounts of data as sensor devices in the network, the collected data is sent to the cloud for statistical analysis, prediction, and data analysis to achieve business objectives. However, because the cloud computing model… ▽ More

    Submitted 26 March, 2024; originally announced March 2024.

  47. arXiv:2403.17524  [pdf, other

    cs.CR cs.CL

    Provably Secure Disambiguating Neural Linguistic Steganography

    Authors: Yuang Qi, Kejiang Chen, Kai Zeng, Weiming Zhang, Nenghai Yu

    Abstract: Recent research in provably secure neural linguistic steganography has overlooked a crucial aspect: the sender must detokenize stegotexts to avoid raising suspicion from the eavesdropper. The segmentation ambiguity problem, which arises when using language models based on subwords, leads to occasional decoding failures in all neural language steganography implementations based on these models. Cur… ▽ More

    Submitted 26 March, 2024; originally announced March 2024.

  48. arXiv:2403.16037  [pdf, other

    cs.IR

    Knowledge-aware Dual-side Attribute-enhanced Recommendation

    Authors: Taotian Pang, Xingyu Lou, Fei Zhao, Zhen Wu, Kuiyao Dong, Qiuying Peng, Yue Qi, Xinyu Dai

    Abstract: \textit{Knowledge-aware} recommendation methods (KGR) based on \textit{graph neural networks} (GNNs) and \textit{contrastive learning} (CL) have achieved promising performance. However, they fall short in modeling fine-grained user preferences and further fail to leverage the \textit{preference-attribute connection} to make predictions, leading to sub-optimal performance. To address the issue, we… ▽ More

    Submitted 24 March, 2024; originally announced March 2024.

  49. arXiv:2403.11222  [pdf, other

    cs.CV

    SpikeNeRF: Learning Neural Radiance Fields from Continuous Spike Stream

    Authors: Lin Zhu, Kangmin Jia, Yifan Zhao, Yunshan Qi, Lizhi Wang, Hua Huang

    Abstract: Spike cameras, leveraging spike-based integration sampling and high temporal resolution, offer distinct advantages over standard cameras. However, existing approaches reliant on spike cameras often assume optimal illumination, a condition frequently unmet in real-world scenarios. To address this, we introduce SpikeNeRF, the first work that derives a NeRF-based volumetric scene representation from… ▽ More

    Submitted 17 March, 2024; originally announced March 2024.

    Comments: Accepted by CVPR 2024

  50. arXiv:2403.07636  [pdf, other

    cs.CV

    Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework

    Authors: Vu Minh Hieu Phan, Yutong Xie, Yuankai Qi, Lingqiao Liu, Liyang Liu, Bowen Zhang, Zhibin Liao, Qi Wu, Minh-Son To, Johan W. Verjans

    Abstract: Medical vision language pre-training (VLP) has emerged as a frontier of research, enabling zero-shot pathological recognition by comparing the query image with the textual descriptions for each disease. Due to the complex semantics of biomedical texts, current methods struggle to align medical images with key pathological findings in unstructured reports. This leads to the misalignment with the ta… ▽ More

    Submitted 31 March, 2024; v1 submitted 12 March, 2024; originally announced March 2024.

    Comments: Accepted at CVPR2024. Pre-print before final camera-ready version

    Journal ref: CVPR2024