Showing 1–50 of 110 results for author: Zhang, H

  1. arXiv:2407.13118  [pdf, other]

    q-bio.NC stat.CO

    Evaluating the evolution and inter-individual variability of infant functional module development from 0 to 5 years old

    Authors: Lingbin Bian, Nizhuan Wang, Yuanning Li, Adeel Razi, Qian Wang, Han Zhang, Dinggang Shen, the UNC/UMN Baby Connectome Project Consortium

    Abstract: The segregation and integration of infant brain networks undergo tremendous changes due to the rapid development of brain function and organization. Traditional methods for estimating brain modularity usually rely on group-averaged functional connectivity (FC), often overlooking individual variability. To address this, we introduce a novel approach utilizing Bayesian modeling to analyze the dynami…

    Submitted 17 July, 2024; originally announced July 2024.

  2. arXiv:2407.11734  [pdf, other]

    q-bio.QM cs.LG q-bio.GN

    Generating Multi-Modal and Multi-Attribute Single-Cell Counts with CFGen

    Authors: Alessandro Palma, Till Richter, Hanyi Zhang, Manuel Lubetzki, Alexander Tong, Andrea Dittadi, Fabian Theis

    Abstract: Generative modeling of single-cell RNA-seq data has shown invaluable potential in community-driven tasks such as trajectory inference, batch effect removal and gene expression generation. However, most recent deep models generating synthetic single cells from noise operate on pre-processed continuous gene expression approximations, ignoring the inherently discrete and over-dispersed nature of sing…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: 28 pages, 12 figures

  3. arXiv:2405.18658  [pdf, other]

    q-bio.NC cs.AI

    D-CoRP: Differentiable Connectivity Refinement for Functional Brain Networks

    Authors: Haoyu Hu, Hongrun Zhang, Chao Li

    Abstract: Brain network is an important tool for understanding the brain, offering insights for scientific research and clinical diagnosis. Existing models for brain networks typically focus primarily on brain regions or overlook the complexity of brain connectivities. MRI-derived brain network data is commonly susceptible to connectivity noise, underscoring the necessity of incorporating connectivities int…

    Submitted 28 May, 2024; originally announced May 2024.

  4. arXiv:2405.14545  [pdf, other]

    q-bio.BM cs.LG

    A Cross-Field Fusion Strategy for Drug-Target Interaction Prediction

    Authors: Hongzhi Zhang, Xiuwen Gong, Shirui Pan, Jia Wu, Bo Du, Wenbin Hu

    Abstract: Drug-target interaction (DTI) prediction is a critical component of the drug discovery process. In the drug development engineering field, predicting novel drug-target interactions is extremely crucial. However, although existing methods have achieved high accuracy levels in predicting known drugs and drug targets, they fail to utilize global protein information during DTI prediction. This leads to…

    Submitted 23 May, 2024; originally announced May 2024.

  5. arXiv:2404.19230  [pdf]

    q-bio.BM cs.AI

    Deep Lead Optimization: Leveraging Generative AI for Structural Modification

    Authors: Odin Zhang, Haitao Lin, Hui Zhang, Huifeng Zhao, Yufei Huang, Yuansheng Huang, Dejun Jiang, Chang-yu Hsieh, Peichen Pan, Tingjun Hou

    Abstract: The idea of using deep-learning-based molecular generation to accelerate discovery of drug candidates has attracted extraordinary attention, and many deep generative models have been developed for automated drug design, termed molecular generation. In general, molecular generation encompasses two main strategies: de novo design, which generates novel molecular structures from scratch, and lead opt…

    Submitted 29 April, 2024; originally announced April 2024.

  6. arXiv:2403.20097  [pdf, other]

    cs.AI cs.HC q-bio.NC

    ITCMA: A Generative Agent Based on a Computational Consciousness Structure

    Authors: Hanzhong Zhang, Jibin Yin, Haoyang Wang, Ziwei Xiang

    Abstract: Large Language Models (LLMs) still face challenges in tasks requiring understanding implicit instructions and applying common-sense knowledge. In such scenarios, LLMs may require multiple attempts to achieve human-level performance, potentially leading to inaccurate responses or inferences in practical environments, affecting their long-term consistency and behavior. This paper introduces the Inte…

    Submitted 8 June, 2024; v1 submitted 29 March, 2024; originally announced March 2024.

    Comments: 20 pages, 11 figures

    ACM Class: I.2; J.4

  7. arXiv:2403.09560  [pdf, other]

    cs.LG physics.chem-ph q-bio.BM

    Self-Consistency Training for Density-Functional-Theory Hamiltonian Prediction

    Authors: He Zhang, Chang Liu, Zun Wang, Xinran Wei, Siyuan Liu, Nanning Zheng, Bin Shao, Tie-Yan Liu

    Abstract: Predicting the mean-field Hamiltonian matrix in density functional theory is a fundamental formulation to leverage machine learning for solving molecular science problems. Yet, its applicability is limited by insufficient labeled data for training. In this work, we highlight that Hamiltonian prediction possesses a self-consistency principle, based on which we propose self-consistency training, an…

    Submitted 5 June, 2024; v1 submitted 14 March, 2024; originally announced March 2024.

    Comments: Accepted by ICML 2024

  8. arXiv:2403.06792  [pdf]

    q-bio.BM q-bio.TO

    Study of the mechanism of electroacupuncture regulating ferroptosis, inhibiting bladder neck fibrosis, and improving bladder urination function after suprasacral spinal cord injury using proteomics

    Authors: Jin-Can Liu, Li-Ya Tang, Xiao-Ying Sun, Qi-Rui Qu, Qiong Liu, Lu Zhou, Hong Zhang, Bruce Song, Ming Xu, Kun Ai

    Abstract: Purpose: The aim of this study was to explore whether electroacupuncture regulates phenotypic transformation of smooth muscle cells by inhibiting ferroptosis and inhibiting fibrosis, thereby improving bladder urination function after suprasacral spinal cord injury (SSCI). Methods: The experiment was divided into sham, model, and electroacupuncture group. After 10 days of electroacupuncture intervent…

    Submitted 11 March, 2024; originally announced March 2024.

  9. arXiv:2401.07657  [pdf, other]

    cs.LG cs.CE q-bio.BM

    Empirical Evidence for the Fragment level Understanding on Drug Molecular Structure of LLMs

    Authors: Xiuyuan Hu, Guoqing Liu, Yang Zhao, Hao Zhang

    Abstract: AI for drug discovery has been a research hotspot in recent years, and SMILES-based language models have been increasingly applied in drug molecular design. However, no work has explored whether and how language models understand the chemical spatial structure from 1D sequences. In this work, we pre-train a transformer model on chemical language and fine-tune it toward drug design objectives, and i…

    Submitted 15 January, 2024; originally announced January 2024.

    Comments: Accepted by AAAI 2024 workshop: Large Language Models for Biological Discoveries (LLMs4Bio)

  10. arXiv:2401.06155  [pdf, other]

    q-bio.BM cs.CE cs.LG

    De novo Drug Design using Reinforcement Learning with Multiple GPT Agents

    Authors: Xiuyuan Hu, Guoqing Liu, Yang Zhao, Hao Zhang

    Abstract: De novo drug design is a pivotal issue in pharmacology and a new area of focus in AI for science research. A central challenge in this field is to generate molecules with specific properties while also producing a wide range of diverse candidates. Although advanced technologies such as transformer models and reinforcement learning have been applied in drug design, their potential has not been full…

    Submitted 21 December, 2023; originally announced January 2024.

    Comments: Accepted by NeurIPS 2023

  11. arXiv:2310.19849  [pdf, other]

    q-bio.BM cs.LG q-bio.QM

    Predicting mutational effects on protein-protein binding via a side-chain diffusion probabilistic model

    Authors: Shiwei Liu, Tian Zhu, Milong Ren, Chungong Yu, Dongbo Bu, Haicang Zhang

    Abstract: Many crucial biological processes rely on networks of protein-protein interactions. Predicting the effect of amino acid mutations on protein-protein binding is vital in protein engineering and therapeutic discovery. However, the scarcity of annotated experimental data on binding energy poses a significant challenge for developing computational approaches, particularly deep learning-based methods.…

    Submitted 30 October, 2023; originally announced October 2023.

  12. arXiv:2310.12035  [pdf]

    cs.HC q-bio.NC

    Tracking dynamic flow: Decoding flow fluctuations through performance in a fine motor control task

    Authors: Bohao Tian, Shijun Zhang, Sirui Chen, Yuru Zhang, Kaiping Peng, Hongxing Zhang, Dangxiao Wang

    Abstract: Flow, an optimal mental state merging action and awareness, significantly impacts our emotion, performance, and well-being. However, capturing its swift fluctuations on a fine timescale is challenging due to the sparsity of the existing flow detecting tools. Here we present a fine fingertip force control (F3C) task to induce flow, wherein the task challenge is set at a compatible level with person…

    Submitted 28 December, 2023; v1 submitted 18 October, 2023; originally announced October 2023.

  13. arXiv:2309.14954  [pdf, other]

    q-bio.BM cs.AI

    Addressing preferred orientation in single-particle cryo-EM through AI-generated auxiliary particles

    Authors: Hui Zhang, Dihan Zheng, Qiurong Wu, Nieng Yan, Zuoqiang Shi, Mingxu Hu, Chenglong Bao

    Abstract: The single-particle cryo-EM field faces the persistent challenge of preferred orientation, lacking general computational solutions. We introduce cryoPROS, an AI-based approach designed to address the above issue. By generating the auxiliary particles with a conditional deep generative model, cryoPROS addresses the intrinsic bias in orientation estimation for the observed particles. We effectively…

    Submitted 26 September, 2023; originally announced September 2023.

  14. arXiv:2309.10008  [pdf]

    q-bio.MN cs.LG

    DeepHEN: quantitative prediction essential lncRNA genes and rethinking essentialities of lncRNA genes

    Authors: Hanlin Zhang, Wenzheng Cheng

    Abstract: Gene essentiality refers to the degree to which a gene is necessary for the survival and reproductive efficacy of a living organism. Although the essentiality of non-coding genes has been documented, there are still aspects of non-coding genes' essentiality that are unknown to us. For example, we do not know the contribution of sequence features and network spatial features to essentiality. As a c…

    Submitted 17 September, 2023; originally announced September 2023.

  15. arXiv:2308.11927  [pdf, other]

    q-bio.QM cs.CV eess.IV

    Recovering a Molecule's 3D Dynamics from Liquid-phase Electron Microscopy Movies

    Authors: Enze Ye, Yuhang Wang, Hong Zhang, Yiqin Gao, Huan Wang, He Sun

    Abstract: The dynamics of biomolecules are crucial for our understanding of their functioning in living systems. However, current 3D imaging techniques, such as cryogenic electron microscopy (cryo-EM), require freezing the sample, which limits the observation of their conformational changes in real time. The innovative liquid-phase electron microscopy (liquid-phase EM) technique allows molecules to be place…

    Submitted 23 August, 2023; originally announced August 2023.

  16. arXiv:2308.02172  [pdf]

    q-bio.BM

    Delete: Deep Lead Optimization Enveloped in Protein Pocket through Unified Deleting Strategies and a Structure-aware Network

    Authors: Haotian Zhang, Huifeng Zhao, Xujun Zhang, Qun Su, Hongyan Du, Chao Shen, Zhe Wang, Dan Li, Peichen Pan, Guangyong Chen, Yu Kang, Chang-yu Hsieh, Tingjun Hou

    Abstract: Drug discovery is a highly complicated process, and it is unfeasible to fully commit it to the recently developed molecular generation methods. Deep learning-based lead optimization takes expert knowledge as a starting point, learning from numerous historical cases about how to modify the structure for better drug-forming properties. However, compared with the more established de novo generation s…

    Submitted 4 August, 2023; originally announced August 2023.

  17. arXiv:2307.09580  [pdf, other]

    q-bio.BM cs.DS q-bio.GN

    LinearSankoff: Linear-time Simultaneous Folding and Alignment of RNA Homologs

    Authors: Sizhen Li, Ning Dai, He Zhang, Apoorv Malik, David H. Mathews, Liang Huang

    Abstract: The classical Sankoff algorithm for the simultaneous folding and alignment of homologous RNA sequences is highly influential, but it suffers from two major limitations in efficiency and modeling power. First, it takes $O(n^6)$ for two sequences where n is the average sequence length. Most implementations and variations reduce the runtime to $O(n^3)$ by restricting the alignment search space, but t…

    Submitted 18 July, 2023; originally announced July 2023.

  18. arXiv:2307.07072  [pdf, other]

    cs.LG cs.CE eess.IV q-bio.QM stat.ML

    Rician likelihood loss for quantitative MRI using self-supervised deep learning

    Authors: Christopher S. Parker, Anna Schroder, Sean C. Epstein, James Cole, Daniel C. Alexander, Hui Zhang

    Abstract: Purpose: Previous quantitative MR imaging studies using self-supervised deep learning have reported biased parameter estimates at low SNR. Such systematic errors arise from the choice of Mean Squared Error (MSE) loss function for network training, which is incompatible with Rician-distributed MR magnitude signals. To address this issue, we introduce the negative log Rician likelihood (NLR) loss. M…

    Submitted 13 July, 2023; originally announced July 2023.

    Comments: 16 pages, 6 figures

  19. arXiv:2306.05445  [pdf, other]

    physics.chem-ph cs.LG q-bio.BM

    Towards Predicting Equilibrium Distributions for Molecular Systems with Deep Learning

    Authors: Shuxin Zheng, Jiyan He, Chang Liu, Yu Shi, Ziheng Lu, Weitao Feng, Fusong Ju, Jiaxi Wang, Jianwei Zhu, Yaosen Min, He Zhang, Shidi Tang, Hongxia Hao, Peiran Jin, Chi Chen, Frank Noé, Haiguang Liu, Tie-Yan Liu

    Abstract: Advances in deep learning have greatly improved structure prediction of molecules. However, many macroscopic observations that are important for real-world applications are not functions of a single molecular structure, but rather determined from the equilibrium distribution of structures. Traditional methods for obtaining these distributions, such as molecular dynamics simulation, are computation…

    Submitted 8 June, 2023; originally announced June 2023.

    Comments: 80 pages, 11 figures

  20. arXiv:2306.00041  [pdf, other]

    q-bio.QM cs.LG

    Causal Intervention for Measuring Confidence in Drug-Target Interaction Prediction

    Authors: Wenting Ye, Chen Li, Yang Xie, Wen Zhang, Hong-Yu Zhang, Bowen Wang, Debo Cheng, Zaiwen Feng

    Abstract: Identifying and discovering drug-target interactions (DTIs) are vital steps in drug discovery and development. They play a crucial role in assisting scientists in finding new drugs and accelerating the drug development process. Recently, knowledge graph and knowledge graph embedding (KGE) models have made rapid advancements and demonstrated impressive performance in drug discovery. However, such mo…

    Submitted 14 November, 2023; v1 submitted 31 May, 2023; originally announced June 2023.

  21. arXiv:2305.10466  [pdf]

    q-bio.QM cs.AI eess.IV

    Solitary pulmonary nodules prediction for lung cancer patients using nomogram and machine learning

    Authors: Hailan Zhang, Gongjin Song

    Abstract: Lung cancer (LC) is a type of malignant neoplasm that originates in the bronchial mucosa or glands. As a clinically common nodule, solitary pulmonary nodules (SPNs) have a significantly higher probability of malignancy when they are larger than 8 mm in diameter. But there is also a risk of lung cancer when the diameter is less than 8 mm; the purpose of this study was to create a nomogram for estimating t…

    Submitted 17 May, 2023; originally announced May 2023.

  22. arXiv:2305.03061  [pdf, other]

    eess.IV q-bio.NC

    Mining fMRI Dynamics with Parcellation Prior for Brain Disease Diagnosis

    Authors: Xiaozhao Liu, Mianxin Liu, Lang Mei, Yuyao Zhang, Feng Shi, Han Zhang, Dinggang Shen

    Abstract: To characterize atypical brain dynamics under diseases, prevalent studies investigate functional magnetic resonance imaging (fMRI). However, most of the existing analyses compress rich spatial-temporal information as the brain functional networks (BFNs) and directly investigate the whole-brain network without neurological priors about functional subnetworks. We thus propose a novel graph learning…

    Submitted 4 May, 2023; originally announced May 2023.

    Comments: 5 pages, 2 figures, conference paper, accepted by IEEE International Symposium on Biomedical Imaging (ISBI) 2023

  23. arXiv:2304.10494  [pdf]

    q-bio.BM cs.AI cs.LG

    Infinite Physical Monkey: Do Deep Learning Methods Really Perform Better in Conformation Generation?

    Authors: Haotian Zhang, Jintu Zhang, Huifeng Zhao, Dejun Jiang, Yafeng Deng

    Abstract: Conformation Generation is a fundamental problem in drug discovery and cheminformatics. And organic molecule conformation generation, particularly in vacuum and protein pocket environments, is most relevant to drug design. Recently, with the development of geometric neural networks, the data-driven schemes have been successfully applied in this field, both for molecular conformation generation (in…

    Submitted 7 March, 2023; originally announced April 2023.

  24. arXiv:2304.09285  [pdf, other]

    cs.LG cs.AI cs.CV q-bio.QM

    Pelphix: Surgical Phase Recognition from X-ray Images in Percutaneous Pelvic Fixation

    Authors: Benjamin D. Killeen, Han Zhang, Jan Mangulabnan, Mehran Armand, Russel H. Taylor, Greg Osgood, Mathias Unberath

    Abstract: Surgical phase recognition (SPR) is a crucial element in the digital transformation of the modern operating theater. While SPR based on video sources is well-established, incorporation of interventional X-ray sequences has not yet been explored. This paper presents Pelphix, a first approach to SPR for X-ray-guided percutaneous pelvic fracture fixation, which models the procedure at four levels of…

    Submitted 18 April, 2023; originally announced April 2023.

  25. arXiv:2304.07132  [pdf, other]

    cs.LG q-bio.BM

    Towards Controllable Diffusion Models via Reward-Guided Exploration

    Authors: Hengtong Zhang, Tingyang Xu

    Abstract: By formulating data samples' formation as a Markov denoising process, diffusion models achieve state-of-the-art performances in a collection of tasks. Recently, many variants of diffusion models have been proposed to enable controlled sample generation. Most of these existing methods either formulate the controlling information as an input (i.e., conditional representation) for the noise approxim…

    Submitted 14 April, 2023; originally announced April 2023.

  26. arXiv:2303.14193  [pdf, other]

    q-bio.MN cs.CE

    Quadratic Graph Attention Network (Q-GAT) for Robust Construction of Gene Regulatory Networks

    Authors: Hui Zhang, Xuexin An, Qiang He, Yudong Yao, Yudong Zhang, Feng-Lei Fan, Yueyang Teng

    Abstract: Gene regulatory relationships can be abstracted as a gene regulatory network (GRN), which plays a key role in characterizing complex cellular processes and pathways. Recently, graph neural networks (GNNs), as a class of deep learning models, have emerged as a useful tool to infer gene regulatory relationships from gene expression data. However, deep learning models have been found to be vulnerable…

    Submitted 4 November, 2023; v1 submitted 23 March, 2023; originally announced March 2023.

  27. arXiv:2302.11314  [pdf, other]

    cs.DB q-bio.QM

    SGMFQP:An Ontology-based Swine Gut Microbiota Federated Query Platform

    Authors: Ying Wang, Qin Jiang, Yilin Geng, Yuren Hu, Yue Tang, Jixiang Li, Junmei Zhang, Wolfgang Mayer, Shanmei Liu, Hong-Yu Zhang, Xianghua Yan, Zaiwen Feng

    Abstract: Gut microbiota plays a crucial role in modulating pig development and health, and gut microbiota characteristics are associated with differences in feed efficiency. To answer open questions in feed efficiency analysis, biologists seek to retrieve information across multiple heterogeneous data sources. However, this is error-prone and time-consuming work since the queries can involve a sequence of…

    Submitted 22 February, 2023; originally announced February 2023.

  28. arXiv:2302.10406  [pdf]

    cs.CL cs.CV cs.LG eess.IV q-bio.QM

    Time to Embrace Natural Language Processing (NLP)-based Digital Pathology: Benchmarking NLP- and Convolutional Neural Network-based Deep Learning Pipelines

    Authors: Min Cen, Xingyu Li, Bangwei Guo, Jitendra Jonnagaddala, Hong Zhang, Xu Steven Xu

    Abstract: NLP-based computer vision models, particularly vision transformers, have been shown to outperform CNN models in many imaging tasks. However, most digital pathology artificial-intelligence models are based on CNN architectures, probably owing to a lack of data regarding NLP models for pathology images. In this study, we developed digital pathology pipelines to benchmark the five most recently propo…

    Submitted 20 February, 2023; originally announced February 2023.

  29. arXiv:2211.16712  [pdf, other]

    cs.LG q-bio.QM

    Coordinating Cross-modal Distillation for Molecular Property Prediction

    Authors: Hao Zhang, Nan Zhang, Ruixin Zhang, Lei Shen, Yingyi Zhang, Meng Liu

    Abstract: In recent years, molecular graph representation learning (GRL) has drawn much more attention in molecular property prediction (MPP) problems. The existing graph methods have demonstrated that 3D geometric information is significant for better performance in MPP. However, accurate 3D structures are often costly and time-consuming to obtain, limiting the large-scale application of GRL. It is an intu…

    Submitted 29 November, 2022; originally announced November 2022.

  30. arXiv:2210.14982  [pdf, other]

    q-bio.BM cs.DS physics.bio-ph q-bio.QM

    LinearCoFold and LinearCoPartition: Linear-Time Algorithms for Secondary Structure Prediction of Interacting RNA molecules

    Authors: He Zhang, Sizhen Li, Liang Zhang, David H. Mathews, Liang Huang

    Abstract: Many ncRNAs function through RNA-RNA interactions. Fast and reliable RNA structure prediction with consideration of RNA-RNA interaction is useful. Some existing tools are less accurate due to omitting the competing of intermolecular and intramolecular base pairs, or focus more on predicting the binding region rather than predicting the complete secondary structure of two interacting strands. Vienn…

    Submitted 26 October, 2022; originally announced October 2022.

  31. arXiv:2210.05677  [pdf]

    q-bio.GN cs.LG

    Application of Deep Learning on Single-Cell RNA-sequencing Data Analysis: A Review

    Authors: Matthew Brendel, Chang Su, Zilong Bai, Hao Zhang, Olivier Elemento, Fei Wang

    Abstract: Single-cell RNA-sequencing (scRNA-seq) has become a routinely used technique to quantify the gene expression profile of thousands of single cells simultaneously. Analysis of scRNA-seq data plays an important role in the study of cell states and phenotypes, and has helped elucidate biological processes, such as those occurring during development of complex organisms and improved our understanding o…

    Submitted 11 October, 2022; originally announced October 2022.

  32. arXiv:2209.11232  [pdf]

    eess.IV cs.AI cs.CV q-bio.NC

    Hierarchical Graph Convolutional Network Built by Multiscale Atlases for Brain Disorder Diagnosis Using Functional Connectivity

    Authors: Mianxin Liu, Han Zhang, Feng Shi, Dinggang Shen

    Abstract: Functional connectivity network (FCN) data from functional magnetic resonance imaging (fMRI) is increasingly used for the diagnoses of brain disorders. However, state-of-the-art studies used to build the FCN using a single brain parcellation atlas at a certain spatial scale, which largely neglected functional interactions across different spatial scales in hierarchical manners. In this study, we p…

    Submitted 22 September, 2022; originally announced September 2022.

  33. Interpreting the Mechanism of Synergism for Drug Combinations Using Attention-Based Hierarchical Graph Pooling

    Authors: Zehao Dong, Heming Zhang, Yixin Chen, Philip R. O. Payne, Fuhai Li

    Abstract: Synergistic drug combinations provide huge potentials to enhance therapeutic efficacy and to reduce adverse reactions. However, effective and synergistic drug combination prediction remains an open question because of the unknown causal disease signaling pathways. Though various deep learning (AI) models have been proposed to quantitatively predict the synergism of drug combinations, the major lim…

    Submitted 22 August, 2023; v1 submitted 19 September, 2022; originally announced September 2022.

    Journal ref: Cancers 2023, 15(17), 4210

  34. arXiv:2209.07921  [pdf, other]

    cs.LG cs.AI q-bio.QM

    ImDrug: A Benchmark for Deep Imbalanced Learning in AI-aided Drug Discovery

    Authors: Lanqing Li, Liang Zeng, Ziqi Gao, Shen Yuan, Yatao Bian, Bingzhe Wu, Hengtong Zhang, Yang Yu, Chan Lu, Zhipeng Zhou, Hongteng Xu, Jia Li, Peilin Zhao, Pheng-Ann Heng

    Abstract: The last decade has witnessed a prosperous development of computational methods and dataset curation for AI-aided drug discovery (AIDD). However, real-world pharmaceutical datasets often exhibit highly imbalanced distribution, which is overlooked by the current literature but may severely compromise the fairness and generalization of machine learning applications. Motivated by this observation, we…

    Submitted 17 October, 2022; v1 submitted 16 September, 2022; originally announced September 2022.

    Comments: 29 pages, 7 figures, 8 tables, a machine learning benchmark submission

  35. arXiv:2209.05710  [pdf, other]

    cs.LG q-bio.BM

    MDM: Molecular Diffusion Model for 3D Molecule Generation

    Authors: Lei Huang, Hengtong Zhang, Tingyang Xu, Ka-Chun Wong

    Abstract: Molecule generation, especially generating 3D molecular geometries from scratch (i.e., 3D de novo generation), has become a fundamental task in drug designs. Existing diffusion-based 3D molecule generation methods could suffer from unsatisfactory performances, especially when generating large molecules. At the same time, the generated molecules lack enough diversity. This paper proposes a…

    Submitted 12 September, 2022; originally announced September 2022.

    Comments: Submitted to AAAI'23

  36. arXiv:2208.11518  [pdf]

    q-bio.QM

    Prognostic Significance of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images in Colorectal Cancers

    Authors: Anran Liu, Xingyu Li, Hongyi Wu, Bangwei Guo, Jitendra Jonnagaddala, Hong Zhang, Xu Steven Xu

    Abstract: Purpose: Tumor-infiltrating lymphocytes (TILs) have significant prognostic values in cancers. However, very few automated, deep-learning-based TIL scoring algorithms have been developed for colorectal cancers (CRC). Methods: We developed an automated, multiscale LinkNet workflow for quantifying cellular-level TILs for CRC tumors using H&E-stained images. The predictive performance of the automatic T…

    Submitted 15 September, 2022; v1 submitted 23 August, 2022; originally announced August 2022.

  37. arXiv:2208.10495  [pdf]

    q-bio.QM cs.LG eess.IV

    Predicting microsatellite instability and key biomarkers in colorectal cancer from H&E-stained images: Achieving SOTA predictive performance with fewer data using Swin Transformer

    Authors: Bangwei Guo, Xingyu Li, Jitendra Jonnagaddala, Hong Zhang, Xu Steven Xu

    Abstract: Artificial intelligence (AI) models have been developed for predicting clinically relevant biomarkers, including microsatellite instability (MSI), for colorectal cancers (CRC). However, the current deep-learning networks are data-hungry and require large training datasets, which are often lacking in the medical domain. In this study, based on the latest Hierarchical Vision Transformer using Shifte…

    Submitted 11 September, 2022; v1 submitted 21 August, 2022; originally announced August 2022.

  38. arXiv:2208.01169  [pdf, other]

    q-bio.PE cs.NE stat.ML

    A Modified PINN Approach for Identifiable Compartmental Models in Epidemiology with Applications to COVID-19

    Authors: Haoran Hu, Connor M Kennedy, Panayotis G. Kevrekidis, Hongkun Zhang

    Abstract: A variety of approaches using compartmental models have been used to study the COVID-19 pandemic and the usage of machine learning methods with these models has had particularly notable success. We present here an approach toward analyzing accessible data on COVID-19's U.S. development using a variation of the "Physics Informed Neural Networks" (PINN) which is capable of using the knowledge of the…

    Submitted 17 August, 2022; v1 submitted 1 August, 2022; originally announced August 2022.

    Comments: 22 pages, 8 figures, to be submitted to Viruses

  39. arXiv:2207.07734  [pdf, other]

    q-bio.GN cs.AI cs.GL

    COEM: Cross-Modal Embedding for MetaCell Identification

    Authors: Haiyi Mao, Minxue Jia, Jason Xiaotian Dou, Haotian Zhang, Panayiotis V. Benos

    Abstract: Metacells are disjoint and homogeneous groups of single-cell profiles, representing discrete and highly granular cell states. Existing metacell algorithms tend to use only one modality to infer metacells, even though single-cell multi-omics datasets profile multiple molecular modalities within the same cell. Here, we present Cross-MOdal Embedding for MetaCell Id…

    Submitted 24 July, 2022; v1 submitted 15 July, 2022; originally announced July 2022.

    Comments: 5 pages, 2 figures, ICML workshop on computational biology

  40. arXiv:2206.14794  [pdf, ps, other]

    q-bio.BM cs.DS physics.bio-ph q-bio.QM

    LinearAlifold: Linear-Time Consensus Structure Prediction for RNA Alignments

    Authors: Apoorv Malik, Liang Zhang, Milan Gautam, Ning Dai, Sizhen Li, He Zhang, David H. Mathews, Liang Huang

    Abstract: Predicting the consensus structure of a set of aligned RNA homologs is a convenient method to find conserved structures in an RNA genome, which has many applications including viral diagnostics and therapeutics. However, the most commonly used tool for this task, RNAalifold, is prohibitively slow for long sequences, due to a cubic scaling with the sequence length, taking over a day on 400 SARS-CoV…

    Submitted 5 July, 2024; v1 submitted 29 June, 2022; originally announced June 2022.

  41. arXiv:2206.13574  [pdf, ps, other]

    q-bio.GN q-bio.QM

    Fast sequence to graph alignment using the graph wavefront algorithm

    Authors: Haowen Zhang, Shiqi Wu, Srinivas Aluru, Heng Li

    Abstract: Motivation: A pan-genome graph represents a collection of genomes and encodes sequence variations between them. It is a powerful data structure for studying multiple similar genomes. Sequence-to-graph alignment is an essential step for the construction and the analysis of pan-genome graphs. However, existing algorithms incur runtime proportional to the product of sequence length and graph size, ma…

    Submitted 27 June, 2022; originally announced June 2022.

  42. arXiv:2206.00455  [pdf]

    q-bio.QM cs.AI cs.CV cs.LG q-bio.GN

    A robust and lightweight deep attention multiple instance learning algorithm for predicting genetic alterations

    Authors: Bangwei Guo, Xingyu Li, Miaomiao Yang, Hong Zhang, Xu Steven Xu

    Abstract: Deep-learning models based on whole-slide digital pathology images (WSIs) have become increasingly popular for predicting molecular biomarkers. Instance-based models have been the mainstream strategy for predicting genetic alterations using WSIs, although bag-based models along with self-attention mechanism-based algorithms have been proposed for other digital pathology applications. In this paper, we pr…

    Submitted 31 May, 2022; originally announced June 2022.

  43. arXiv:2205.03834  [pdf]

    q-bio.MN cs.LG

    FP-GNN: a versatile deep learning architecture for enhanced molecular property prediction

    Authors: Hanxuan Cai, Huimin Zhang, Duancheng Zhao, Jingxing Wu, Ling Wang

    Abstract: Deep learning is an important method for molecular design and exhibits considerable ability to predict molecular properties, including physicochemical, bioactive, and ADME/T (absorption, distribution, metabolism, excretion, and toxicity) properties. In this study, we advanced a novel deep learning architecture, termed FP-GNN, which combined and simultaneously learned information from molecular gra…

    Submitted 8 May, 2022; originally announced May 2022.

  44. arXiv:2204.02855  [pdf, other]

    cs.ET cs.IT math.CO q-bio.GN

    SPIDER-WEB generates coding algorithms with superior error tolerance and real-time information retrieval capacity

    Authors: Haoling Zhang, Zhaojun Lan, Wenwei Zhang, Xun Xu, Zhi Ping, Yiwei Zhang, Yue Shen

    Abstract: DNA has been considered a promising medium for storing digital information. As an essential step in the DNA-based data storage workflow, coding algorithms are responsible for implementing functions including bit-to-base transcoding, error correction, etc. In previous studies, these functions are normally realized by introducing multiple algorithms. Here, we report a graph-based architecture, named SPI…

    Submitted 30 March, 2023; v1 submitted 6 April, 2022; originally announced April 2022.

    Comments: 47 pages; 13 figures; 8 tables

    MSC Class: 46N60; 94C15; 94B70; 68P25 ACM Class: I.1.2; D.2.8; E.3; G.2.2

  45. arXiv:2204.01593  [pdf]

    q-bio.QM cs.AI cs.CV cs.LG eess.IV

    Optimize Deep Learning Models for Prediction of Gene Mutations Using Unsupervised Clustering

    Authors: Zihan Chen, Xingyu Li, Miaomiao Yang, Hong Zhang, Xu Steven Xu

    Abstract: Deep learning has become the mainstream methodological choice for analyzing and interpreting whole-slide digital pathology images (WSIs). It is commonly assumed that tumor regions carry most predictive information. In this paper, we propose an unsupervised clustering-based multiple-instance learning method and apply it to develop deep-learning models for prediction of gene mutations using WSIs…

    Submitted 24 April, 2022; v1 submitted 31 March, 2022; originally announced April 2022.

  46. arXiv:2203.08820  [pdf, other]

    q-bio.QM cs.CV cs.LG

    DePS: An improved deep learning model for de novo peptide sequencing

    Authors: Cheng Ge, Yi Lu, Jia Qu, Liangxu Xie, Feng Wang, Hong Zhang, Ren Kong, Shan Chang

    Abstract: De novo peptide sequencing from mass spectrometry data is an important method for protein identification. Recently, various deep learning approaches have been applied to de novo peptide sequencing, and DeepNovoV2 is one of the representative models. In this study, we proposed an enhanced model, DePS, which can improve the accuracy of de novo peptide sequencing even with missing signal peaks or large num…

    Submitted 16 March, 2022; originally announced March 2022.

    Comments: 10 pages, 7 figures

  47. arXiv:2203.00743  [pdf]

    q-bio.QM

    Uncovering the dynamic effects of DEX treatment on lung cancer by integrating bioinformatic inference and multiscale modeling of scRNA-seq and proteomics data

    Authors: Minghan Chen, Chunrui Xu, Ziang Xu, Wei He, Haorui Zhang, Jing Su, Qianqian Song

    Abstract: Motivation: Lung cancer is one of the leading causes of cancer-related death, with a five-year survival rate of 18%. It is a priority for us to understand the underlying mechanisms that affect the implementation and effectiveness of lung cancer therapeutics. In this study, we combine the power of Bioinformatics and Systems Biology to comprehensively uncover functional and signaling pathways of dr…

    Submitted 1 March, 2022; originally announced March 2022.

  48. arXiv:2201.08941  [pdf]

    q-bio.QM q-bio.NC

    Uncovering the System Vulnerability and Criticality of Human Brain under Dynamical Neuropathological Events in Alzheimer's Disease

    Authors: Jingwen Zhang, Qing Liu, Haorui Zhang, Michelle Dai, Qianqian Song, Defu Yang, Guorong Wu, Minghan Chen

    Abstract: Background: Despite the striking efforts in investigating neurobiological factors behind the acquisition of amyloid-β (A), protein tau (T), and neurodegeneration ([N]) biomarkers, the mechanistic pathways of how AT[N] biomarkers spread throughout the brain remain elusive. Objectives: To disentangle the massive heterogeneities in AD progressions and identify vulnerable/critical brain regio…

    Submitted 21 August, 2023; v1 submitted 21 January, 2022; originally announced January 2022.

  49. arXiv:2110.05400  [pdf]

    q-bio.QM

    MAGORINO: Magnitude-only fat fraction and R2* estimation with Rician noise modelling

    Authors: Timothy JP Bray, Alan Bainbridge, Margaret A Hall-Craggs, Hui Zhang

    Abstract: Purpose: Magnitude-based fitting of chemical shift-encoded data enables proton density fat fraction (PDFF) and R2* estimation where complex-based methods fail or when phase data is inaccessible or unreliable, such as in multi-centre studies. However, traditional magnitude-based fitting algorithms suffer from Rician noise-related bias and fat-water swaps. To address these issues, we propose an algo…

    Submitted 3 March, 2022; v1 submitted 11 October, 2021; originally announced October 2021.

  50. arXiv:2109.04361  [pdf, other]

    eess.SP cs.LG q-bio.NC

    MutualGraphNet: A novel model for motor imagery classification

    Authors: Yan Li, Ning Zhong, David Taniar, Haolan Zhang

    Abstract: Motor imagery classification is of great significance to humans with mobility impairments, and how to extract and utilize the effective features from motor imagery electroencephalogram (EEG) channels has always been the focus of attention. There are many different methods for motor imagery classification, but the limited understanding of the human brain requires more effective methods for extractin…

    Submitted 2 September, 2021; originally announced September 2021.