Showing 1–50 of 5,305 results for author: Li, X

  1. arXiv:2407.13274  [pdf, other]

    cs.IR

    Aligning Explanations for Recommendation with Rating and Feature via Maximizing Mutual Information

    Authors: Yurou Zhao, Yiding Sun, Ruidong Han, Fei Jiang, Lu Guan, Xiang Li, Wei Lin, Jiaxin Mao

    Abstract: Providing natural language-based explanations to justify recommendations helps to improve users' satisfaction and gain users' trust. However, as current explanation generation methods are commonly trained with an objective to mimic existing user reviews, the generated explanations are often not aligned with the predicted ratings or some important features of the recommended items, and thus, are su…

    Submitted 18 July, 2024; originally announced July 2024.

    Comments: This paper has been accepted by CIKM 2024, and the camera-ready version will be updated soon

  2. arXiv:2407.13254  [pdf, other]

    cs.CV

    Make a Strong Teacher with Label Assistance: A Novel Knowledge Distillation Approach for Semantic Segmentation

    Authors: Shoumeng Qiu, Jie Chen, Xinrun Li, Ru Wan, Xiangyang Xue, Jian Pu

    Abstract: In this paper, we introduce a novel knowledge distillation approach for the semantic segmentation task. Unlike previous methods that rely on power-trained teachers or other modalities to provide additional knowledge, our approach does not require complex teacher models or information from extra sensors. Specifically, for the teacher model training, we propose to noise the label and then incorporat…

    Submitted 18 July, 2024; originally announced July 2024.

    Journal ref: ECCV 2024

  3. arXiv:2407.13214  [pdf, other]

    cs.CV

    TXL-PBC: a freely accessible labeled peripheral blood cell dataset

    Authors: Lu Gan, Xi Li

    Abstract: In a recent study, we found that the publicly available BCCD and BCD datasets have significant issues such as labeling errors, insufficient sample size, and poor data quality. To address these problems, we performed sample deletion, re-labeling, and integration of these two datasets. Additionally, we introduced the PBC and Raabin-WBC datasets, and ultimately created a high-quality, sample-balanced new dataset,…

    Submitted 18 July, 2024; originally announced July 2024.

  4. arXiv:2407.13182  [pdf, other]

    cs.LG cs.AI q-bio.GN

    SpaDiT: Diffusion Transformer for Spatial Gene Expression Prediction using scRNA-seq

    Authors: Xiaoyu Li, Fangfang Zhu, Wenwen Min

    Abstract: The rapid development of spatial transcriptomics (ST) technologies is revolutionizing our understanding of the spatial organization of biological tissues. Current ST methods, categorized into next-generation sequencing-based (seq-based) and fluorescence in situ hybridization-based (image-based) methods, offer innovative insights into the functional dynamics of biological tissues. However, these me…

    Submitted 18 July, 2024; originally announced July 2024.

  5. arXiv:2407.13142  [pdf]

    cs.CL cs.LG cs.SD eess.AS

    A light-weight and efficient punctuation and word casing prediction model for on-device streaming ASR

    Authors: Jian You, Xiangfeng Li

    Abstract: Punctuation and word casing prediction are necessary for automatic speech recognition (ASR). With the popularity of on-device end-to-end streaming ASR systems, the on-device punctuation and word casing prediction become a necessity while we found little discussion on this. With the emergence of Transformer, Transformer based models have been explored for this scenario. However, Transformer based m…

    Submitted 18 July, 2024; originally announced July 2024.

  6. arXiv:2407.13133  [pdf, other]

    cs.CV

    FocusDiffuser: Perceiving Local Disparities for Camouflaged Object Detection

    Authors: Jianwei Zhao, Xin Li, Fan Yang, Qiang Zhai, Ao Luo, Zicheng Jiao, Hong Cheng

    Abstract: Detecting objects seamlessly blended into their surroundings represents a complex task for both human cognitive capabilities and advanced artificial intelligence algorithms. Currently, the majority of methodologies for detecting camouflaged objects mainly focus on utilizing discriminative models with various unique designs. However, it has been observed that generative models, such as Stable Diffu…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: 18 pages, 7 figures

  7. arXiv:2407.13108  [pdf, other]

    cs.CV

    UCIP: A Universal Framework for Compressed Image Super-Resolution using Dynamic Prompt

    Authors: Xin Li, Bingchen Li, Yeying Jin, Cuiling Lan, Hanxin Zhu, Yulin Ren, Zhibo Chen

    Abstract: Compressed Image Super-resolution (CSR) aims to simultaneously super-resolve the compressed images and tackle the challenging hybrid distortions caused by compression. However, existing works on CSR usually focus on a single compression codec, i.e., JPEG, ignoring the diverse traditional or learning-based codecs in the practical application, e.g., HEVC, VVC, HIFIC, etc. In this work, we propose…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  8. arXiv:2407.12857  [pdf, other]

    cs.CL cs.DL cs.IR

    Automated Peer Reviewing in Paper SEA: Standardization, Evaluation, and Analysis

    Authors: Jianxiang Yu, Zichen Ding, Jiaqi Tan, Kangyang Luo, Zhenmin Weng, Chenghua Gong, Long Zeng, Renjing Cui, Chengcheng Han, Qiushi Sun, Zhiyong Wu, Yunshi Lan, Xiang Li

    Abstract: In recent years, the rapid increase in scientific papers has overwhelmed traditional review mechanisms, resulting in varying quality of publications. Although existing methods have explored the capabilities of Large Language Models (LLMs) for automated scientific reviewing, their generated contents are often generic or partial. To address the issues above, we introduce an automated paper reviewing…

    Submitted 9 July, 2024; originally announced July 2024.

  9. arXiv:2407.12851  [pdf]

    cs.CL

    ISPO: An Integrated Ontology of Symptom Phenotypes for Semantic Integration of Traditional Chinese Medical Data

    Authors: Zixin Shu, Rui Hua, Dengying Yan, Chenxia Lu, Ning Xu, Jun Li, Hui Zhu, Jia Zhang, Dan Zhao, Chenyang Hui, Junqiu Ye, Chu Liao, Qi Hao, Wen Ye, Cheng Luo, Xinyan Wang, Chuang Cheng, Xiaodong Li, Baoyan Liu, Xiaji Zhou, Runshun Zhang, Min Xu, Xuezhong Zhou

    Abstract: Symptom phenotypes are one of the key types of manifestations for diagnosis and treatment of various disease conditions. However, the diversity of symptom terminologies is one of the major obstacles hindering the analysis and knowledge sharing of various types of symptom-related medical data particularly in the fields of Traditional Chinese Medicine (TCM). Objective: This study aimed to construct…

    Submitted 8 July, 2024; originally announced July 2024.

    Comments: 39 pages, 6 figures, 6 tables

  10. arXiv:2407.12827  [pdf, other]

    cs.CL cs.LG

    The Solution for The PST-KDD-2024 OAG-Challenge

    Authors: Shupeng Zhong, Xinger Li, Shushan Jin, Yang Yang

    Abstract: In this paper, we introduce the second-place solution in the KDD-2024 OAG-Challenge paper source tracing track. Our solution is mainly based on two methods, BERT and GCN, and combines the reasoning results of BERT and GCN in the final submission to achieve complementary performance. In the BERT solution, we focus on processing the fragments that appear in the references of the paper, and use a var…

    Submitted 2 July, 2024; originally announced July 2024.

  11. arXiv:2407.12648  [pdf, ps, other]

    cs.IT eess.SP

    Blind Beamforming for Coverage Enhancement with Intelligent Reflecting Surface

    Authors: Fan Xu, Jiawei Yao, Wenhai Lai, Kaiming Shen, Xin Li, Xin Chen, Zhi-Quan Luo

    Abstract: Conventional policy for configuring an intelligent reflecting surface (IRS) typically requires channel state information (CSI), thus incurring substantial overhead costs and facing incompatibility with the current network protocols. This paper proposes a blind beamforming strategy in the absence of CSI, aiming to boost the minimum signal-to-noise ratio (SNR) among all the receiver positions, namel…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: 17 pages

  12. arXiv:2407.12582  [pdf, other]

    cs.CV cs.AI cs.RO

    Embracing Events and Frames with Hierarchical Feature Refinement Network for Object Detection

    Authors: Hu Cao, Zehua Zhang, Yan Xia, Xinyi Li, Jiahao Xia, Guang Chen, Alois Knoll

    Abstract: In frame-based vision, object detection faces substantial performance degradation under challenging conditions due to the limited sensing capability of conventional cameras. Event cameras output sparse and asynchronous events, providing a potential solution to solve these problems. However, effectively fusing two heterogeneous modalities remains an open issue. In this work, we propose a novel hier…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  13. arXiv:2407.12576  [pdf, other]

    cs.AR cs.AI

    IICPilot: An Intelligent Integrated Circuit Backend Design Framework Using Open EDA

    Authors: Zesong Jiang, Qing Zhang, Cheng Liu, Huawei Li, Xiaowei Li

    Abstract: Open-source EDA tools are rapidly advancing, fostering collaboration, innovation, and knowledge sharing within the EDA community. However, the growing complexity of these tools, characterized by numerous design parameters and heuristics, poses a significant barrier to their widespread adoption. This complexity is particularly pronounced in integrated circuit (IC) backend designs, which place subst…

    Submitted 17 July, 2024; originally announced July 2024.

    Comments: under review

  14. arXiv:2407.12575  [pdf, other]

    cs.AR

    Graphitron: A Domain Specific Language for FPGA-based Graph Processing Accelerator Generation

    Authors: Xinmiao Zhang, Zheng Feng, Shengwen Liang, Xinyu Chen, Cheng Liu, Huawei Li, Xiaowei Li

    Abstract: FPGA-based graph processing accelerators, enabling extensive customization, have demonstrated significant energy efficiency over general computing engines like CPUs and GPUs. Nonetheless, customizing accelerators to diverse graph processing algorithms with distinct computational patterns remains challenging and error-prone for high-level application users. To this end, template-based approaches ha…

    Submitted 17 July, 2024; originally announced July 2024.

  15. arXiv:2407.12565  [pdf, other]

    cs.AR

    SigDLA: A Deep Learning Accelerator Extension for Signal Processing

    Authors: Fangfa Fu, Wenyu Zhang, Zesong Jiang, Zhiyu Zhu, Guoyu Li, Bing Yang, Cheng Liu, Liyi Xiao, Jinxiang Wang, Huawei Li, Xiaowei Li

    Abstract: Deep learning and signal processing are closely correlated in many IoT scenarios such as anomaly detection to empower intelligence of things. Many IoT processors utilize digital signal processors (DSPs) for signal processing and build deep learning frameworks on this basis. While deep learning is usually much more computing-intensive than signal processing, the computing efficiency of deep learnin…

    Submitted 17 July, 2024; originally announced July 2024.

  16. arXiv:2407.12504  [pdf, other]

    cs.CL

    Case2Code: Learning Inductive Reasoning with Synthetic Data

    Authors: Yunfan Shao, Linyang Li, Yichuan Ma, Peiji Li, Demin Song, Qinyuan Cheng, Shimin Li, Xiaonan Li, Pengyu Wang, Qipeng Guo, Hang Yan, Xipeng Qiu, Xuanjing Huang, Dahua Lin

    Abstract: Complex reasoning is an impressive ability shown by large language models (LLMs). Most LLMs are skilled in deductive reasoning, such as chain-of-thought prompting or iterative tool-using to solve challenging tasks step-by-step. In this paper, we hope to focus on evaluating and teaching LLMs to conduct inductive reasoning, that is, LLMs are supposed to infer underlying rules by observing examples o…

    Submitted 17 July, 2024; originally announced July 2024.

  17. arXiv:2407.12271  [pdf, other]

    cs.CV eess.IV

    RBAD: A Dataset and Benchmark for Retinal Vessels Branching Angle Detection

    Authors: Hao Wang, Wenhui Zhu, Jiayou Qin, Xin Li, Oana Dumitrascu, Xiwen Chen, Peijie Qiu, Abolfazl Razi

    Abstract: Detecting retinal image analysis, particularly the geometrical features of branching points, plays an essential role in diagnosing eye diseases. However, existing methods used for this purpose often are coarse-level and lack fine-grained analysis for efficient annotation. To mitigate these issues, this paper proposes a novel method for detecting retinal branching angles using a self-configured ima…

    Submitted 16 July, 2024; originally announced July 2024.

  18. arXiv:2407.12249  [pdf, other]

    cs.IT

    Beamforming Design for Secure MC-NOMA Empowered ISAC Systems with an Active Eve

    Authors: Zhongqing Wu, Xuehua Li, Yuanxin Cai, Weijie Yuan

    Abstract: As the integrated sensing and communication (ISAC) technology emerges as a promising component of sixth generation (6G), the study of its physical layer security has become a key concern for researchers. Specifically, in this work, we focus on the security issues over a multi-carrier (MC)-non-orthogonal multiple access (NOMA) assisted ISAC system, considering imperfect channel state information (CS…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 6 pages, 5 figures, conference, This paper has been accepted by ICCC Workshops 2024

  19. arXiv:2407.12057  [pdf, other]

    cs.CL cs.AI

    NinjaLLM: Fast, Scalable and Cost-effective RAG using Amazon SageMaker and AWS Trainium and Inferentia2

    Authors: Tengfei Xue, Xuefeng Li, Roman Smirnov, Tahir Azim, Arash Sadrieh, Babak Pahlavan

    Abstract: Retrieval-augmented generation (RAG) techniques are widely used today to retrieve and present information in a conversational format. This paper presents a set of enhancements to traditional RAG techniques, focusing on large language models (LLMs) fine-tuned and hosted on AWS Trainium and Inferentia2 AI chips via SageMaker. These chips are characterized by their elasticity, affordability, and effi…

    Submitted 11 July, 2024; originally announced July 2024.

    ACM Class: I.2.7

  20. arXiv:2407.12037  [pdf, other]

    cs.AR cs.SE

    A Novel HDL Code Generator for Effectively Testing FPGA Logic Synthesis Compilers

    Authors: Zhihao Xu, Shikai Guo, Guilin Zhao, Peiyu Zou, Xiaochen Li, He Jiang

    Abstract: Field Programmable Gate Array (FPGA) logic synthesis compilers (e.g., Vivado, Iverilog, Yosys, and Quartus) are widely applied in Electronic Design Automation (EDA), such as the development of FPGA programs. However, defects (i.e., incorrect synthesis) in logic synthesis compilers may lead to unexpected behaviors in target applications, posing security risks. Therefore, it is crucial to thoroughly…

    Submitted 1 July, 2024; originally announced July 2024.

  21. arXiv:2407.12019  [pdf, other]

    cs.CL cs.AI

    DIM: Dynamic Integration of Multimodal Entity Linking with Large Language Model

    Authors: Shezheng Song, Shasha Li, Jie Yu, Shan Zhao, Xiaopeng Li, Jun Ma, Xiaodong Liu, Zhuo Li, Xiaoguang Mao

    Abstract: Our study delves into Multimodal Entity Linking, aligning the mention in multimodal information with entities in knowledge base. Existing methods are still facing challenges like ambiguous entity representations and limited image information utilization. Thus, we propose dynamic entity extraction using ChatGPT, which dynamically extracts entities and enhances datasets. We also propose a method: Dy…

    Submitted 27 June, 2024; originally announced July 2024.

    Comments: Published on PRCV24

  22. arXiv:2407.11812  [pdf, ps, other]

    cs.LG q-bio.QM

    DFDRNN: A dual-feature based neural network for drug repositioning

    Authors: Enqiang Zhu, Xiang Li, Chanjuan Liu, Nikhil R. Pal

    Abstract: Drug repositioning is an economically efficient strategy used to discover new indications for existing drugs beyond their original approvals, expanding their applicability and usage to address challenges in disease treatment. In recent years, deep-learning techniques for drug repositioning have gained much attention. While most deep learning-based research methods focus on encoding drugs and disea…

    Submitted 16 July, 2024; originally announced July 2024.

  23. arXiv:2407.11663  [pdf, other]

    cs.CV

    Affective Behavior Analysis using Task-adaptive and AU-assisted Graph Network

    Authors: Xiaodong Li, Wenchao Du, Hongyu Yang

    Abstract: In this paper, we present our solution and experiment result for the Multi-Task Learning Challenge of the 7th Affective Behavior Analysis in-the-wild (ABAW7) Competition. This challenge consists of three tasks: action unit detection, facial expression recognition, and valence-arousal estimation. We address the research problems of this challenge from three aspects: 1) For learning robust visual feat…

    Submitted 16 July, 2024; originally announced July 2024.

  24. arXiv:2407.11654  [pdf, other]

    cs.LG cs.AI eess.SP

    R-SFLLM: Jamming Resilient Framework for Split Federated Learning with Large Language Models

    Authors: Aladin Djuhera, Vlad C. Andrei, Xinyang Li, Ullrich J. Mönich, Holger Boche, Walid Saad

    Abstract: Split federated learning (SFL) is a compute-efficient paradigm in distributed machine learning (ML), where components of large ML models are outsourced to remote servers. A significant challenge in SFL, particularly when deployed over wireless channels, is the susceptibility of transmitted model parameters to adversarial jamming that could jeopardize the learning process. This is particularly pron…

    Submitted 16 July, 2024; originally announced July 2024.

  25. arXiv:2407.11644  [pdf, other]

    cs.CV cs.RO

    Perception Helps Planning: Facilitating Multi-Stage Lane-Level Integration via Double-Edge Structures

    Authors: Guoliang You, Xiaomeng Chu, Yifan Duan, Wenyu Zhang, Xingchen Li, Sha Zhang, Yao Li, Jianmin Ji, Yanyong Zhang

    Abstract: When planning for autonomous driving, it is crucial to consider essential traffic elements such as lanes, intersections, traffic regulations, and dynamic agents. However, they are often overlooked by the traditional end-to-end planning methods, likely leading to inefficiencies and non-compliance with traffic regulations. In this work, we endeavor to integrate the perception of these elements into…

    Submitted 16 July, 2024; originally announced July 2024.

  26. arXiv:2407.11420  [pdf, other]

    cs.RO

    iKalibr: Unified Targetless Spatiotemporal Calibration for Resilient Integrated Inertial Systems

    Authors: Shuolong Chen, Xingxing Li, Shengyu Li, Yuxuan Zhou, Xiaoteng Yang

    Abstract: The integrated inertial system, typically integrating an IMU and an exteroceptive sensor such as radar, LiDAR, and camera, has been widely accepted and applied in modern robotic applications for ego-motion estimation, motion control, or autonomous exploration. To improve system accuracy, robustness, and further usability, both multiple and various sensors are generally resiliently integrated, whic…

    Submitted 16 July, 2024; originally announced July 2024.

  27. arXiv:2407.11324  [pdf, other]

    cs.AR

    ApproxPilot: A GNN-based Accelerator Approximation Framework

    Authors: Qing Zhang, Cheng Liu, Siting Liu, Yajuan Hui, Huawei Li, Xiaowei Li

    Abstract: A typical optimization of customized accelerators for error-tolerant applications such as multimedia, recognition, and classification is to replace traditional arithmetic units like multipliers and adders with the approximate ones to enhance energy efficiency while adhering to accuracy requirements. However, the plethora of arithmetic units and diverse approximate unit options result in an exceedi…

    Submitted 15 July, 2024; originally announced July 2024.

  28. arXiv:2407.11299  [pdf, other]

    cs.RO cs.CV

    FR-SLAM: A SLAM Improvement Method Based on Floor Plan Registration

    Authors: Jiantao Feng, Xinde Li, HyunCheol Park, Juan Liu, Zhentong Zhang

    Abstract: Simultaneous Localization and Mapping (SLAM) technology enables the construction of environmental maps and localization, serving as a key technique for indoor autonomous navigation of mobile robots. Traditional SLAM methods typically require exhaustive traversal of all rooms during indoor navigation to obtain a complete map, resulting in lengthy path planning times and prolonged time to reach targ…

    Submitted 15 July, 2024; originally announced July 2024.

  29. arXiv:2407.11034  [pdf]

    cs.LG

    Bridging Data Gaps in Healthcare: A Scoping Review of Transfer Learning in Biomedical Data Analysis

    Authors: Siqi Li, Xin Li, Kunyu Yu, Di Miao, Mingcheng Zhu, Mengying Yan, Yuhe Ke, Danny D'Agostino, Yilin Ning, Qiming Wu, Ziwen Wang, Yuqing Shang, Molei Liu, Chuan Hong, Nan Liu

    Abstract: Clinical and biomedical research in low-resource settings often faces significant challenges due to the need for high-quality data with sufficient sample sizes to construct effective models. These constraints hinder robust model training and prompt researchers to seek methods for leveraging existing knowledge from related studies to support new research efforts. Transfer learning (TL), a machine l…

    Submitted 4 July, 2024; originally announced July 2024.

  30. arXiv:2407.10990  [pdf]

    cs.CL cs.AI

    MedBench: A Comprehensive, Standardized, and Reliable Benchmarking System for Evaluating Chinese Medical Large Language Models

    Authors: Mianxin Liu, Jinru Ding, Jie Xu, Weiguo Hu, Xiaoyang Li, Lifeng Zhu, Zhian Bai, Xiaoming Shi, Benyou Wang, Haitao Song, Pengfei Liu, Xiaofan Zhang, Shanshan Wang, Kang Li, Haofen Wang, Tong Ruan, Xuanjing Huang, Xin Sun, Shaoting Zhang

    Abstract: Ensuring the general efficacy and goodness for human beings from medical large language models (LLM) before real-world deployment is crucial. However, a widely accepted and accessible evaluation process for medical LLM, especially in the Chinese context, remains to be established. In this work, we introduce "MedBench", a comprehensive, standardized, and reliable benchmarking system for Chinese med…

    Submitted 23 June, 2024; originally announced July 2024.

    Comments: 25 pages, 4 figures

  31. arXiv:2407.10979  [pdf, ps, other]

    cs.NI

    Diffusion Model-based Incentive Mechanism with Prospect Theory for Edge AIGC Services in 6G IoT

    Authors: Jinbo Wen, Jiangtian Nie, Yue Zhong, Changyan Yi, Xiaohuan Li, Jiangming Jin, Yang Zhang, Dusit Niyato

    Abstract: The fusion of Internet of Things (IoT) with Sixth-Generation (6G) technology has significant potential to revolutionize the IoT landscape. Utilizing the ultra-reliable and low-latency communication capabilities of 6G, 6G-IoT networks can transmit high-quality and diverse data to enhance edge learning. Artificial Intelligence-Generated Content (AIGC) harnesses advanced AI algorithms to automaticall…

    Submitted 10 June, 2024; originally announced July 2024.

  32. arXiv:2407.10918  [pdf, other]

    cs.CV

    PartImageNet++ Dataset: Scaling up Part-based Models for Robust Recognition

    Authors: Xiao Li, Yining Liu, Na Dong, Sitian Qin, Xiaolin Hu

    Abstract: Deep learning-based object recognition systems can be easily fooled by various adversarial perturbations. One reason for the weak robustness may be that they do not have part-based inductive bias like the human recognition process. Motivated by this, several part-based recognition models have been proposed to improve the adversarial robustness of recognition. However, due to the lack of part annot…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  33. arXiv:2407.10833  [pdf, other]

    eess.IV cs.CV

    MoE-DiffIR: Task-customized Diffusion Priors for Universal Compressed Image Restoration

    Authors: Yulin Ren, Xin Li, Bingchen Li, Xingrui Wang, Mengxi Guo, Shijie Zhao, Li Zhang, Zhibo Chen

    Abstract: We present MoE-DiffIR, an innovative universal compressed image restoration (CIR) method with task-customized diffusion priors. This intends to handle two pivotal challenges in the existing CIR methods: (i) lacking adaptability and universality for different image codecs, e.g., JPEG and WebP; (ii) poor texture generation capability, particularly at low bitrates. Specifically, our MoE-DiffIR develo…

    Submitted 15 July, 2024; originally announced July 2024.

    Comments: Accepted by ECCV 2024

  34. arXiv:2407.10408  [pdf, other]

    cs.IT eess.SP

    Latency Minimization for IRS-enhanced Wideband MEC Networks with Practical Reflection Model

    Authors: N. Li, W. Hao, X. Li, Z. Zhu, Z. Tang, S. Yang

    Abstract: Intelligent reflecting surface (IRS) has been considered as an efficient way to boost the computation capability of mobile edge computing (MEC) system, especially when the communication links are blocked or the communication signal is weak. However, most existing works are restricted to narrow-band channel and ideal IRS reflection model, which is not practical and may lead to significant performanc…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 13 pages, 9 figures

  35. arXiv:2407.10327  [pdf, other]

    cs.LG cs.AI cs.CV

    Learning Unlabeled Clients Divergence via Anchor Model Aggregation for Federated Semi-supervised Learning

    Authors: Marawan Elbatel, Hualiang Wang, Jixiang Chen, Hao Wang, Xiaomeng Li

    Abstract: Federated semi-supervised learning (FedSemi) refers to scenarios where there may be clients with fully labeled data, clients with partially labeled, and even fully unlabeled clients while preserving data privacy. However, challenges arise from client drift due to undefined heterogeneous class distributions and erroneous pseudo-labels. Existing FedSemi methods typically fail to aggregate models fro…

    Submitted 14 July, 2024; originally announced July 2024.

  36. arXiv:2407.10274  [pdf, other]

    cs.CV cs.LG

    Enhancing Weakly-Supervised Histopathology Image Segmentation with Knowledge Distillation on MIL-Based Pseudo-Labels

    Authors: Yinsheng He, Xingyu Li, Roger J. Zemp

    Abstract: Segmenting tumors in histological images is vital for cancer diagnosis. While fully supervised models excel with pixel-level annotations, creating such annotations is labor-intensive and costly. Accurate histopathology image segmentation under weakly-supervised conditions with coarse-grained image labels is still a challenging problem. Although multiple instance learning (MIL) has shown promise in…

    Submitted 14 July, 2024; originally announced July 2024.

  37. arXiv:2407.10204  [pdf, other]

    cs.LG

    Improving Graph Out-of-distribution Generalization on Real-world Data

    Authors: Can Xu, Yao Cheng, Jianxiang Yu, Haosen Wang, Jingsong Lv, Xiang Li

    Abstract: Existing methods for graph out-of-distribution (OOD) generalization primarily rely on empirical studies on synthetic datasets. Such approaches tend to overemphasize the causal relationships between invariant sub-graphs and labels, thereby neglecting the non-negligible role of environment in real-world scenarios. In contrast to previous studies that impose rigid independence assumptions on environm…

    Submitted 14 July, 2024; originally announced July 2024.

    Comments: 21 pages, 5 figures

  38. arXiv:2407.09540  [pdf, other]

    eess.IV cs.CE cs.CV cs.LG q-bio.TO

    Prompting Whole Slide Image Based Genetic Biomarker Prediction

    Authors: Ling Zhang, Boxiang Yun, Xingran Xie, Qingli Li, Xinxing Li, Yan Wang

    Abstract: Prediction of genetic biomarkers, e.g., microsatellite instability and BRAF in colorectal cancer is crucial for clinical decision making. In this paper, we propose a whole slide image (WSI) based genetic biomarker prediction method via prompting techniques. Our work aims at addressing the following challenges: (1) extracting foreground instances related to genetic biomarkers from gigapixel WSIs, a…

    Submitted 26 June, 2024; originally announced July 2024.

    Comments: 11 pages, 3 figures, MICCAI 2024

  39. arXiv:2407.09488  [pdf, other]

    q-bio.NC cs.LG cs.NE

    Manifold Learning via Memory and Context

    Authors: Xin Li

    Abstract: Given a memory with infinite capacity, can we solve the learning problem? Apparently, nature has solved this problem as evidenced by the evolution of mammalian brains. Inspired by the organizational principles underlying hippocampal-neocortical systems, we present a navigation-based approach to manifold learning using memory and context. The key insight is to navigate on the manifold and memorize…

    Submitted 17 May, 2024; originally announced July 2024.

  40. arXiv:2407.09088  [pdf, other]

    eess.IV cs.AI cs.CV

    FD-SOS: Vision-Language Open-Set Detectors for Bone Fenestration and Dehiscence Detection from Intraoral Images

    Authors: Marawan Elbatel, Keyuan Liu, Yanqi Yang, Xiaomeng Li

    Abstract: Accurate detection of bone fenestration and dehiscence (FD) is crucial for effective treatment planning in dentistry. While cone-beam computed tomography (CBCT) is the gold standard for evaluating FD, it comes with limitations such as radiation exposure, limited accessibility, and higher cost compared to intraoral images. In intraoral images, dentists face challenges in the differential diagnosis…

    Submitted 12 July, 2024; originally announced July 2024.

    Comments: MICCAI 2024

  41. arXiv:2407.08944  [pdf, other]

    cs.CV eess.IV

    Bora: Biomedical Generalist Video Generation Model

    Authors: Weixiang Sun, Xiaocao You, Ruizhe Zheng, Zhengqing Yuan, Xiang Li, Lifang He, Quanzheng Li, Lichao Sun

    Abstract: Generative models hold promise for revolutionizing medical education, robot-assisted surgery, and data augmentation for medical AI development. Diffusion models can now generate realistic images from text prompts, while recent advancements have demonstrated their ability to create diverse, high-quality videos. However, these models often struggle with generating accurate representations of medical…

    Submitted 15 July, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

  42. arXiv:2407.08903  [pdf, other]

    cs.CR cs.AI cs.AR

    TensorTEE: Unifying Heterogeneous TEE Granularity for Efficient Secure Collaborative Tensor Computing

    Authors: Husheng Han, Xinyao Zheng, Yuanbo Wen, Yifan Hao, Erhu Feng, Ling Liang, Jianan Mu, Xiaqing Li, Tianyun Ma, Pengwei Jin, Xinkai Song, Zidong Du, Qi Guo, Xing Hu

    Abstract: Heterogeneous collaborative computing with NPU and CPU has received widespread attention due to its substantial performance benefits. To ensure data confidentiality and integrity during computing, Trusted Execution Environments (TEE) is considered a promising solution because of its comparatively lower overhead. However, existing heterogeneous TEE designs are inefficient for collaborative computin…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: Accepted by ASPLOS 2024

  43. arXiv:2407.08516  [pdf, other]

    cs.AI

    Converging Paradigms: The Synergy of Symbolic and Connectionist AI in LLM-Empowered Autonomous Agents

    Authors: Haoyi Xiong, Zhiyuan Wang, Xuhong Li, Jiang Bian, Zeke Xie, Shahid Mumtaz, Laura E. Barnes

    Abstract: This article explores the convergence of connectionist and symbolic artificial intelligence (AI), from historical debates to contemporary advancements. Traditionally considered distinct paradigms, connectionist AI focuses on neural networks, while symbolic AI emphasizes symbolic representation and logic. Recent advancements in large language models (LLMs), exemplified by ChatGPT and GPT-4, highlig…

    Submitted 15 July, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

  44. arXiv:2407.08351  [pdf, other]

    cs.CL cs.LG

    AutoBencher: Creating Salient, Novel, Difficult Datasets for Language Models

    Authors: Xiang Lisa Li, Evan Zheran Liu, Percy Liang, Tatsunori Hashimoto

    Abstract: Evaluation is critical for assessing capabilities, tracking scientific progress, and informing model selection. In this paper, we present three desiderata for a good benchmark for language models: (i) salience (e.g., knowledge about World War II is more salient than a random day in history), (ii) novelty (i.e., the benchmark reveals new trends in model rankings not shown by previous benchmarks), a…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: preprint

  45. arXiv:2407.08303  [pdf, other]

    cs.CV cs.AI

    DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception

    Authors: Xiaotong Li, Fan Zhang, Haiwen Diao, Yueze Wang, Xinlong Wang, Ling-Yu Duan

    Abstract: Existing Multimodal Large Language Models (MLLMs) increasingly emphasize complex understanding of various visual elements, including multiple objects, text information, and spatial relations. Their development for comprehensive visual perception hinges on the availability of high-quality image-text datasets that offer diverse visual elements and thorough image descriptions. However, the scarcity…

    Submitted 11 July, 2024; originally announced July 2024.

  46. arXiv:2407.08273   

    cs.CL

    RB-SQL: A Retrieval-based LLM Framework for Text-to-SQL

    Authors: Zhenhe Wu, Zhongqiu Li, Jie Zhang, Mengxiang Li, Yu Zhao, Ruiyu Fang, Zhongjiang He, Xuelong Li, Zhoujun Li, Shuangyong Song

    Abstract: Large language models (LLMs) with in-context learning have significantly improved performance on text-to-SQL tasks. Previous works generally focus on using exclusive SQL generation prompts to improve the LLMs' reasoning ability. However, they mostly struggle to handle large databases with numerous tables and columns, and usually ignore the significance of pre-processing the database and extracting v…

    Submitted 12 July, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

    Comments: Further improvement and modification are needed.

  47. arXiv:2407.08154  [pdf, other]

    cs.CE

    Bayesian uncertainty analysis for underwater 3D reconstruction with neural radiance fields

    Authors: Haojie Lian, Xinhao Li, Yilin Qu, Jing Du, Zhuxuan Meng, Jie Liu, Leilei Chen

    Abstract: Neural radiance fields (NeRFs) are a deep learning technique that can generate novel views of 3D scenes using sparse 2D images from different viewing directions and camera poses. As an extension of conventional NeRFs to underwater environments, where light can be absorbed and scattered by water, SeaThru-NeRF was proposed to separate the clean appearance and geometric structure of underwater scene…

    Submitted 10 July, 2024; originally announced July 2024.

  48. arXiv:2407.08101  [pdf, other]

    cs.CV

    Live Fitness Coaching as a Testbed for Situated Interaction

    Authors: Sunny Panchal, Apratim Bhattacharyya, Guillaume Berger, Antoine Mercier, Cornelius Bohm, Florian Dietrichkeit, Reza Pourreza, Xuanlin Li, Pulkit Madan, Mingu Lee, Mark Todorovich, Ingo Bax, Roland Memisevic

    Abstract: Tasks at the intersection of vision and language have had a profound impact in advancing the capabilities of vision-language models such as dialog-based assistants. However, models trained on existing tasks are largely limited to turn-based interactions, where each turn must be stepped (i.e., prompted) by the user. Open-ended, asynchronous interactions where an AI model may proactively deliver tim…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: The benchmark and dataset are available here: https://developer.qualcomm.com/software/ai-datasets/qevd

  49. arXiv:2407.07763  [pdf, other]

    cs.CV

    S&D Messenger: Exchanging Semantic and Domain Knowledge for Generic Semi-Supervised Medical Image Segmentation

    Authors: Qixiang Zhang, Haonan Wang, Xiaomeng Li

    Abstract: Semi-supervised medical image segmentation (SSMIS) has emerged as a promising solution to tackle the challenges of time-consuming manual labeling in the medical field. However, in practical scenarios, there are often domain variations within the datasets, leading to derivative scenarios like semi-supervised medical domain generalization (Semi-MDG) and unsupervised medical domain adaptation (UMDA).…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 10 pages, under review at IEEE Transactions on Medical Imaging

  50. arXiv:2407.07760  [pdf, other]

    cs.CV cs.AI

    Learning Spatial-Semantic Features for Robust Video Object Segmentation

    Authors: Xin Li, Deshui Miao, Zhenyu He, Yaowei Wang, Huchuan Lu, Ming-Hsuan Yang

    Abstract: Tracking and segmenting multiple similar objects with complex or separate parts in long-term videos is inherently challenging due to the ambiguity of target parts and identity confusion caused by occlusion, background clutter, and long-term variations. In this paper, we propose a robust video object segmentation framework equipped with spatial-semantic features and discriminative object queries to…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Winning solution of the VOTS2024 Challenge