Showing 1–50 of 114 results for author: Zhang, L

  1. arXiv:2407.09540  [pdf, other]

    eess.IV cs.CE cs.CV cs.LG q-bio.TO

    Prompting Whole Slide Image Based Genetic Biomarker Prediction

    Authors: Ling Zhang, Boxiang Yun, Xingran Xie, Qingli Li, Xinxing Li, Yan Wang

    Abstract: Prediction of genetic biomarkers, e.g., microsatellite instability and BRAF in colorectal cancer, is crucial for clinical decision making. In this paper, we propose a whole slide image (WSI) based genetic biomarker prediction method via prompting techniques. Our work aims at addressing the following challenges: (1) extracting foreground instances related to genetic biomarkers from gigapixel WSIs, a…

    Submitted 26 June, 2024; originally announced July 2024.

    Comments: 11 pages, 3 figures, MICCAI2024

  2. arXiv:2406.15514  [pdf, other]

    physics.soc-ph q-bio.PE stat.ME

    How big does a population need to be before demographers can ignore individual-level randomness in demographic events?

    Authors: John Bryant, Tahu Kukutai, Junni L. Zhang

    Abstract: When studying a national-level population, demographers can safely ignore the effect of individual-level randomness on age-sex structure. When studying a single community, or group of communities, however, the potential importance of individual-level randomness is less clear. We seek to measure the effect of individual-level randomness in births and deaths on standard summary indicators of age-sex…

    Submitted 20 June, 2024; originally announced June 2024.

    Comments: 28 pages, 8 figures, 3 tables

    MSC Class: 91-XX

  3. arXiv:2406.13133  [pdf, other]

    cs.CL cs.LG q-bio.GN

    PathoLM: Identifying pathogenicity from the DNA sequence through the Genome Foundation Model

    Authors: Sajib Acharjee Dip, Uddip Acharjee Shuvo, Tran Chau, Haoqiu Song, Petra Choi, Xuan Wang, Liqing Zhang

    Abstract: Pathogen identification is pivotal in diagnosing, treating, and preventing diseases, crucial for controlling infections and safeguarding public health. Traditional alignment-based methods, though widely used, are computationally intense and reliant on extensive reference databases, often failing to detect novel pathogens due to their low sensitivity and specificity. Similarly, conventional machine…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 9 pages, 3 figures

  4. arXiv:2406.09817  [pdf, other]

    physics.chem-ph q-bio.BM

    Efficient and Precise Force Field Optimization for Biomolecules Using DPA-2

    Authors: Junhan Chang, Duo Zhang, Yuqing Deng, Hongrui Lin, Zhirong Liu, Linfeng Zhang, Hang Zheng, Xinyan Wang

    Abstract: Molecular simulations are essential tools in computational chemistry, enabling the prediction and understanding of molecular interactions and thermodynamic properties of biomolecules. However, traditional force fields face significant challenges in accurately representing novel molecules and complex chemical environments due to the labor-intensive process of manually setting optimization parameter…

    Submitted 14 June, 2024; originally announced June 2024.

  5. arXiv:2406.00085  [pdf, other]

    eess.IV cs.LG q-bio.NC

    Augmentation-based Unsupervised Cross-Domain Functional MRI Adaptation for Major Depressive Disorder Identification

    Authors: Yunling Ma, Chaojun Zhang, Xiaochuan Wang, Qianqian Wang, Liang Cao, Limei Zhang, Mingxia Liu

    Abstract: Major depressive disorder (MDD) is a common mental disorder that typically affects a person's mood, cognition, behavior, and physical health. Resting-state functional magnetic resonance imaging (rs-fMRI) data are widely used for computer-aided diagnosis of MDD. While multi-site fMRI data can provide more data for training reliable diagnostic models, significant cross-site data heterogeneity would…

    Submitted 6 June, 2024; v1 submitted 31 May, 2024; originally announced June 2024.

  6. arXiv:2405.19420  [pdf, other]

    cs.LG cs.AI q-bio.NC

    Using Contrastive Learning with Generative Similarity to Learn Spaces that Capture Human Inductive Biases

    Authors: Raja Marjieh, Sreejan Kumar, Declan Campbell, Liyi Zhang, Gianluca Bencomo, Jake Snell, Thomas L. Griffiths

    Abstract: Humans rely on strong inductive biases to learn from few examples and abstract useful information from sensory data. Instilling such biases in machine learning models has been shown to improve their performance on various benchmarks including few-shot learning, robustness, and alignment. However, finding effective training procedures to achieve that goal can be challenging as psychologically-rich…

    Submitted 29 May, 2024; originally announced May 2024.

  7. arXiv:2405.11769  [pdf, other]

    q-bio.BM cs.LG physics.bio-ph

    Uni-Mol Docking V2: Towards Realistic and Accurate Binding Pose Prediction

    Authors: Eric Alcaide, Zhifeng Gao, Guolin Ke, Yaqi Li, Linfeng Zhang, Hang Zheng, Gengmo Zhou

    Abstract: In recent years, machine learning (ML) methods have emerged as promising alternatives for molecular docking, offering the potential for high accuracy without incurring prohibitive computational costs. However, recent studies have indicated that these ML models may overfit to quantitative metrics while neglecting the physical constraints inherent in the problem. In this work, we present Uni-Mol Doc…

    Submitted 20 May, 2024; originally announced May 2024.

  8. arXiv:2405.07110  [pdf, other]

    q-bio.PE cs.DS math.CO

    A Vector Representation for Phylogenetic Trees

    Authors: Cedric Chauve, Caroline Colijn, Louxin Zhang

    Abstract: Good representations for phylogenetic trees and networks are important for optimizing storage efficiency and implementation of scalable methods for the inference and analysis of evolutionary trees for genes, genomes and species. We introduce a new representation for rooted phylogenetic trees that encodes a binary tree on n taxa as a vector of length 2n in which each taxon appears exactly twice. Us…

    Submitted 11 May, 2024; originally announced May 2024.

    MSC Class: 05C05; 92B10 ACM Class: G.2.2; J.3

  9. arXiv:2404.06691  [pdf]

    q-bio.BM cs.LG cs.NE

    Latent Chemical Space Searching for Plug-in Multi-objective Molecule Generation

    Authors: Ningfeng Liu, Jie Yu, Siyu Xiu, Xinfang Zhao, Siyu Lin, Bo Qiang, Ruqiu Zheng, Hongwei Jin, Liangren Zhang, Zhenming Liu

    Abstract: Molecular generation, an essential method for identifying new drug structures, has been supported by advancements in machine learning and computational technology. However, challenges remain in multi-objective generation, model adaptability, and practical application in drug discovery. In this study, we developed a versatile 'plug-in' molecular generation model that incorporates multiple objective…

    Submitted 9 April, 2024; originally announced April 2024.

  10. arXiv:2402.02004  [pdf]

    q-bio.BM

    Enhancing the efficiency of protein language models with minimal wet-lab data through few-shot learning

    Authors: Ziyi Zhou, Liang Zhang, Yuanxi Yu, Mingchen Li, Liang Hong, Pan Tan

    Abstract: Accurately modeling the protein fitness landscapes holds great importance for protein engineering. Recently, due to their capacity and representation ability, pre-trained protein language models have achieved state-of-the-art performance in predicting protein fitness without experimental data. However, their predictions are limited in accuracy as well as interpretability. Furthermore, such deep le…

    Submitted 2 February, 2024; originally announced February 2024.

  11. arXiv:2311.18218  [pdf, other]

    q-bio.PE

    Computing the Bounds of the Number of Reticulations in a Tree-Child Network That Displays a Set of Trees

    Authors: Yufeng Wu, Louxin Zhang

    Abstract: A phylogenetic network is an evolutionary model that uses a rooted directed acyclic graph (instead of a tree) to model an evolutionary history of species in which reticulate events (e.g., hybrid speciation or horizontal gene transfer) occurred. A tree-child network is a kind of phylogenetic network with structural constraints. Existing approaches for tree-child network reconstruction can be slow for l…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: 7 figures, 1 table, 33 pages

    MSC Class: 05C30 ACM Class: J.3

  12. arXiv:2311.01276  [pdf, other]

    cs.LG q-bio.QM

    Neural Atoms: Propagating Long-range Interaction in Molecular Graphs through Efficient Communication Channel

    Authors: Xuan Li, Zhanke Zhou, Jiangchao Yao, Yu Rong, Lu Zhang, Bo Han

    Abstract: Graph Neural Networks (GNNs) have been widely adopted for drug discovery with molecular graphs. Nevertheless, current GNNs mainly excel in leveraging short-range interactions (SRI) but struggle to capture long-range interactions (LRI), both of which are crucial for determining molecular properties. To tackle this issue, we propose a method to abstract the collective information of atomic groups in…

    Submitted 31 March, 2024; v1 submitted 2 November, 2023; originally announced November 2023.

  13. arXiv:2309.07701  [pdf]

    cs.HC eess.SP q-bio.NC

    Semantic reconstruction of continuous language from MEG signals

    Authors: Bo Wang, Xiran Xu, Longxiang Zhang, Boda Xiao, Xihong Wu, Jing Chen

    Abstract: Decoding language from neural signals holds considerable theoretical and practical importance. Previous research has indicated the feasibility of decoding text or speech from invasive neural signals. However, when using non-invasive neural signals, significant challenges are encountered due to their low quality. In this study, we proposed a data-driven approach for decoding the semantics of language fr…

    Submitted 14 September, 2023; originally announced September 2023.

  14. arXiv:2309.03907  [pdf, other]

    q-bio.BM cs.LG

    DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs

    Authors: Youwei Liang, Ruiyi Zhang, Li Zhang, Pengtao Xie

    Abstract: A ChatGPT-like system for drug compounds could be a game-changer in pharmaceutical research, accelerating drug discovery, enhancing our understanding of structure-activity relationships, guiding lead optimization, aiding drug repurposing, reducing the failure rate, and streamlining clinical trials. In this work, we make an initial attempt towards enabling ChatGPT-like capabilities on drug molecule…

    Submitted 18 May, 2023; originally announced September 2023.

  15. Deep neural network improves the estimation of polygenic risk scores for breast cancer

    Authors: Adrien Badré, Li Zhang, Wellington Muchero, Justin C. Reynolds, Chongle Pan

    Abstract: Polygenic risk scores (PRS) estimate the genetic risk of an individual for a complex disease based on many genetic variants across the whole genome. In this study, we compared a series of computational models for estimation of breast cancer PRS. A deep neural network (DNN) was found to outperform alternative machine learning techniques and established statistical algorithms, including BLUP, BayesA…

    Submitted 24 July, 2023; originally announced July 2023.

    Comments: 28 pages, 7 figures, 2 Tables

    Journal ref: A. Badré, L. Zhang, W. Muchero, J.C. Reynolds, C. Pan (2021). Deep neural network improves the estimation of polygenic risk scores for breast cancer. Journal of Human Genetics, 66(4), 359-369

  16. arXiv:2307.12682  [pdf]

    q-bio.BM

    Pro-PRIME: A general Temperature-Guided Language model to engineer enhanced Stability and Activity in Proteins

    Authors: Pan Tan, Mingchen Li, Yuanxi Yu, Fan Jiang, Lirong Zheng, Banghao Wu, Xinyu Sun, Liqi Kang, Jie Song, Liang Zhang, Yi Xiong, Wanli Ouyang, Zhiqiang Hu, Guisheng Fan, Yufeng Pei, Liang Hong

    Abstract: Designing protein mutants of both high stability and activity is a critical yet challenging task in protein engineering. Here, we introduce Pro-PRIME, a deep learning zero-shot model, which can suggest protein mutants of improved stability and activity without any prior experimental mutagenesis data. By leveraging temperature-guided language modelling, Pro-PRIME demonstrated superior predictive po…

    Submitted 13 May, 2024; v1 submitted 24 July, 2023; originally announced July 2023.

    Comments: arXiv admin note: text overlap with arXiv:2304.03780

  17. arXiv:2306.08907  [pdf]

    q-bio.BM cs.LG

    MCPI: Integrating Multimodal Data for Enhanced Prediction of Compound Protein Interactions

    Authors: Li Zhang, Wenhao Li, Haotian Guan, Zhiquan He, Mingjun Cheng, Han Wang

    Abstract: The identification of compound-protein interactions (CPI) plays a critical role in drug screening, drug repurposing, and combination therapy studies. The effectiveness of CPI prediction relies heavily on the features extracted from both compounds and target proteins. While various prediction methods employ different feature combinations, both molecular-based and network-based models encounter the…

    Submitted 15 June, 2023; originally announced June 2023.

    Comments: 12 pages, 9 figures

  18. arXiv:2306.07505  [pdf]

    q-bio.TO eess.IV

    Deep learning radiomics for assessment of gastroesophageal varices in people with compensated advanced chronic liver disease

    Authors: Lan Wang, Ruiling He, Lili Zhao, Jia Wang, Zhengzi Geng, Tao Ren, Guo Zhang, Peng Zhang, Kaiqiang Tang, Chaofei Gao, Fei Chen, Liting Zhang, Yonghe Zhou, Xin Li, Fanbin He, Hui Huan, Wenjuan Wang, Yunxiao Liang, Juan Tang, Fang Ai, Tingyu Wang, Liyun Zheng, Zhongwei Zhao, Jiansong Ji, Wei Liu, et al. (22 additional authors not shown)

    Abstract: Objective: Bleeding from gastroesophageal varices (GEV) is a medical emergency associated with high mortality. We aim to construct an artificial intelligence-based model of two-dimensional shear wave elastography (2D-SWE) of the liver and spleen to precisely assess the risk of GEV and high-risk gastroesophageal varices (HRV). Design: A prospective multicenter study was conducted in patients with…

    Submitted 12 June, 2023; originally announced June 2023.

  19. arXiv:2306.01824  [pdf, other]

    q-bio.QM cs.CE cs.LG q-bio.BM

    Enhancing the Protein Tertiary Structure Prediction by Multiple Sequence Alignment Generation

    Authors: Le Zhang, Jiayang Chen, Tao Shen, Yu Li, Siqi Sun

    Abstract: The field of protein folding research has been greatly advanced by deep learning methods, with AlphaFold2 (AF2) demonstrating exceptional performance and atomic-level precision. As co-evolution is integral to protein structure prediction, AF2's accuracy is significantly influenced by the depth of multiple sequence alignment (MSA), which requires extensive exploration of a large protein database fo…

    Submitted 2 June, 2023; originally announced June 2023.

  20. arXiv:2305.03056  [pdf]

    eess.IV q-bio.NC

    Biomarker Investigation using Multiple Brain Measures from MRI through XAI in Alzheimer's Disease Classification

    Authors: Davide Coluzzi, Valentina Bordin, Massimo Walter Rivolta, Igor Fortel, Liang Zhang, Alex Leow, Giuseppe Baselli

    Abstract: Alzheimer's Disease (AD) is the world's leading cause of dementia, a progressively impairing condition leading to high hospitalization rates and mortality. To optimize the diagnostic process, numerous efforts have been directed towards the development of deep learning (DL) approaches for automatic AD classification. However, their typical black box outline has led to low trust and scarce usage w…

    Submitted 3 May, 2023; originally announced May 2023.

    Comments: 26 pages, 5 figures

  21. arXiv:2305.00063  [pdf, other]

    q-bio.NC q-bio.QM

    Applications of Computer Vision in Analysis of the Clock-Drawing Test as a Metric of Cognitive Impairment

    Authors: Luzhou Zhang

    Abstract: The Clock-Drawing test is a well known and widely used neuropsychological metric to assess basic cognitive function. My objective is to combine methods of machine learning in computer vision and image analysis to predict a subject's level of cognitive impairment.

    Submitted 26 April, 2023; originally announced May 2023.

    Comments: 5 pages, 3 figures

  22. arXiv:2304.12239  [pdf, other]

    q-bio.BM cs.LG

    Uni-QSAR: an Auto-ML Tool for Molecular Property Prediction

    Authors: Zhifeng Gao, Xiaohong Ji, Guojiang Zhao, Hongshuai Wang, Hang Zheng, Guolin Ke, Linfeng Zhang

    Abstract: Recently, deep learning based quantitative structure-activity relationship (QSAR) models have shown better performance than traditional methods for property prediction tasks in drug discovery. However, most DL based QSAR models are restricted to limited labeled data to achieve better performance, and also are sensitive to model scale and hyper-parameters. In this paper, we propose Uni-QSAR, a po…

    Submitted 24 April, 2023; originally announced April 2023.

  23. arXiv:2304.03780   

    q-bio.QM

    TemPL: A Novel Deep Learning Model for Zero-Shot Prediction of Protein Stability and Activity Based on Temperature-Guided Language Modeling

    Authors: Pan Tan, Mingchen Li, Liang Zhang, Zhiqiang Hu, Liang Hong

    Abstract: We introduce TemPL, a novel deep learning approach for zero-shot prediction of protein stability and activity, harnessing temperature-guided language modeling. By assembling an extensive dataset of 96 million sequence-host bacterial strain optimal growth temperatures (OGTs) and ΔTm data for point mutations under consistent experimental conditions, we effectively compared TemPL with state-of-the-ar…

    Submitted 13 May, 2024; v1 submitted 7 April, 2023; originally announced April 2023.

    Comments: This project has been terminated

  24. arXiv:2303.15569  [pdf, ps, other]

    cs.LG q-bio.NC

    Core-Periphery Principle Guided Redesign of Self-Attention in Transformers

    Authors: Xiaowei Yu, Lu Zhang, Haixing Dai, Yanjun Lyu, Lin Zhao, Zihao Wu, David Liu, Tianming Liu, Dajiang Zhu

    Abstract: Designing more efficient, reliable, and explainable neural network architectures is critical to studies that are based on artificial intelligence (AI) techniques. Previous studies, by post-hoc analysis, have found that the best-performing ANNs surprisingly resemble biological neural networks (BNN), which indicates that ANNs and BNNs may share some common principles to achieve optimal performance i…

    Submitted 27 March, 2023; originally announced March 2023.

    Comments: Core-periphery, functional brain networks, ViT

  25. Bridging the Gap between Chemical Reaction Pretraining and Conditional Molecule Generation with a Unified Model

    Authors: Bo Qiang, Yiran Zhou, Yuheng Ding, Ningfeng Liu, Song Song, Liangren Zhang, Bo Huang, Zhenming Liu

    Abstract: Chemical reactions are the fundamental building blocks of drug design and organic chemistry research. In recent years, there has been a growing need for a large-scale deep-learning framework that can efficiently capture the basic rules of chemical reactions. In this paper, we have proposed a unified framework that addresses both the reaction representation learning and molecule generation tasks, w…

    Submitted 7 March, 2024; v1 submitted 13 March, 2023; originally announced March 2023.

  26. arXiv:2302.00146  [pdf, other]

    q-bio.NC

    Gyri vs. Sulci: Disentangling Brain Core-Periphery Functional Networks via Twin-Transformer

    Authors: Xiaowei Yu, Lu Zhang, Haixing Dai, Lin Zhao, Yanjun Lyu, Zihao Wu, Tianming Liu, Dajiang Zhu

    Abstract: The human cerebral cortex is highly convoluted into convex gyri and concave sulci. It has been demonstrated that gyri and sulci are significantly different in their anatomy, connectivity, and function: besides exhibiting opposite shape patterns, long-distance axonal fibers connected to gyri are much denser than those connected to sulci, and neural signals on gyri are more complex in low-frequency…

    Submitted 31 January, 2023; originally announced February 2023.

    Comments: 13 pages, 4 figures

  27. arXiv:2301.05826  [pdf, other]

    q-bio.PE

    The k-Robinson-Foulds Dissimilarity Measures for Comparison of Labeled Trees

    Authors: Elahe Khayatian, Gabriel Valiente, Louxin Zhang

    Abstract: Understanding the mutational history of tumor cells is a critical endeavor in unraveling the mechanisms underlying cancer. Since the modeling of tumor cell evolution employs labeled trees, researchers are motivated to develop different methods to assess and compare mutation trees and other labeled trees. While the Robinson-Foulds distance is a widely utilized metric for comparing phylogenetic tree…

    Submitted 16 November, 2023; v1 submitted 14 January, 2023; originally announced January 2023.

    Comments: 39 pages, 11 figures

  28. arXiv:2301.00992  [pdf, other]

    q-bio.PE

    A Fast and Scalable Method for Inferring Phylogenetic Networks from Trees by Aligning Lineage Taxon Strings

    Authors: Louxin Zhang, Niloufar Abhari, Caroline Colijn, Yufeng Wu

    Abstract: The reconstruction of phylogenetic networks is an important but challenging problem in phylogenetics and genome evolution, as the space of phylogenetic networks is vast and cannot be sampled well. One approach to the problem is to solve the minimum phylogenetic network problem, in which phylogenetic trees are first inferred, then the smallest phylogenetic network that displays all the trees is com…

    Submitted 12 April, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

    Comments: 44 pages, 15 figures

    MSC Class: 05C30 ACM Class: J.3

  29. arXiv:2212.12771  [pdf, ps, other]

    cs.LG cs.SI q-bio.QM

    Unsupervised Instance and Subnetwork Selection for Network Data

    Authors: Lin Zhang, Nicholas Moskwa, Melinda Larsen, Petko Bogdanov

    Abstract: Unlike tabular data, features in network data are interconnected within a domain-specific graph. Examples of this setting include gene expression overlaid on a protein interaction network (PPI) and user opinions in a social network. Network data is typically high-dimensional (large number of nodes) and often contains outlier snapshot instances and noise. In addition, it is often non-trivial and ti…

    Submitted 24 December, 2022; originally announced December 2022.

  30. arXiv:2210.14982  [pdf, other]

    q-bio.BM cs.DS physics.bio-ph q-bio.QM

    LinearCoFold and LinearCoPartition: Linear-Time Algorithms for Secondary Structure Prediction of Interacting RNA molecules

    Authors: He Zhang, Sizhen Li, Liang Zhang, David H. Mathews, Liang Huang

    Abstract: Many ncRNAs function through RNA-RNA interactions. Fast and reliable RNA structure prediction with consideration of RNA-RNA interaction is useful. Some existing tools are less accurate due to omitting the competition between intermolecular and intramolecular base pairs, or focus more on predicting the binding region rather than predicting the complete secondary structure of two interacting strands. Vienn…

    Submitted 26 October, 2022; originally announced October 2022.

  31. arXiv:2208.08800  [pdf, other]

    physics.bio-ph cond-mat.soft q-bio.CB q-bio.TO

    Localized growth drives spongy mesophyll morphogenesis

    Authors: John D. Treado, Adam B. Roddy, Guillaume Théroux-Rancourt, Liyong Zhang, Chris Ambrose, Craig Brodersen, Mark D. Shattuck, Corey S. O'Hern

    Abstract: The spongy mesophyll is a complex, porous tissue found in plant leaves that enables carbon capture and provides mechanical stability. Unlike many other biological tissues, which remain confluent throughout development, the spongy mesophyll must develop from an initially confluent tissue into a tortuous network of cells with a large proportion of intercellular airspace. How the airspace in the spon…

    Submitted 18 August, 2022; originally announced August 2022.

    Comments: 28 pages, 6 figures, 1 table, 9 pages of supplementary information

    Journal ref: J. R. Soc. Interface (2022) 19: 20220602

  32. arXiv:2207.11670  [pdf, other]

    cs.NE cs.CV cs.LG q-bio.NC

    Training Stronger Spiking Neural Networks with Biomimetic Adaptive Internal Association Neurons

    Authors: Haibo Shen, Yihao Luo, Xiang Cao, Liangqi Zhang, Juyu Xiao, Tianjiang Wang

    Abstract: As the third generation of neural networks, spiking neural networks (SNNs) are dedicated to exploring more insightful neural mechanisms to achieve near-biological intelligence. Intuitively, biomimetic mechanisms are crucial to understanding and improving SNNs. For example, the associative long-term potentiation (ALTP) phenomenon suggests that in addition to learning mechanisms between neurons, the…

    Submitted 13 March, 2023; v1 submitted 24 July, 2022; originally announced July 2022.

    Comments: Accepted by ICASSP 2023

  33. arXiv:2207.07650  [pdf, other]

    cs.LG cs.AI q-bio.NC

    Contrastive Brain Network Learning via Hierarchical Signed Graph Pooling Model

    Authors: Haoteng Tang, Guixiang Ma, Lei Guo, Xiyao Fu, Heng Huang, Liang Zhang

    Abstract: Recently, brain networks have been widely adopted to study brain dynamics, brain development and brain diseases. Graph representation learning techniques on brain functional networks can facilitate the discovery of novel biomarkers for clinical phenotypes and neurodegenerative diseases. However, current graph learning techniques have several issues on brain network mining. Firstly, most current gra…

    Submitted 14 July, 2022; originally announced July 2022.

  34. arXiv:2207.02629  [pdf, ps, other]

    q-bio.PE

    Can Multiple Phylogenetic Trees Be Displayed in a Tree-Child Network Simultaneously?

    Authors: Yufeng Wu, Louxin Zhang

    Abstract: A binary phylogenetic network on a taxon set $X$ is a rooted acyclic digraph in which the degree of each nonleaf node is three and its leaves (i.e., degree-one nodes) are uniquely labeled with the taxa of $X$. It is tree-child if each nonleaf node has at least one child of indegree one. A set of binary phylogenetic trees may or may not be simultaneously displayed in a binary tree-child network. Nece…

    Submitted 6 July, 2022; originally announced July 2022.

    Comments: 17 pages, 7 figures

    MSC Class: 05C30 ACM Class: J.3

  35. arXiv:2206.14794  [pdf, ps, other]

    q-bio.BM cs.DS physics.bio-ph q-bio.QM

    LinearAlifold: Linear-Time Consensus Structure Prediction for RNA Alignments

    Authors: Apoorv Malik, Liang Zhang, Milan Gautam, Ning Dai, Sizhen Li, He Zhang, David H. Mathews, Liang Huang

    Abstract: Predicting the consensus structure of a set of aligned RNA homologs is a convenient method to find conserved structures in an RNA genome, which has many applications including viral diagnostics and therapeutics. However, the most commonly used tool for this task, RNAalifold, is prohibitively slow for long sequences, due to a cubic scaling with the sequence length, taking over a day on 400 SARS-CoV…

    Submitted 5 July, 2024; v1 submitted 29 June, 2022; originally announced June 2022.

  36. arXiv:2206.04903  [pdf]

    physics.bio-ph physics.app-ph q-bio.QM q-bio.TO

    MorphoSim: An efficient and scalable phase-field framework for accurately simulating multicellular morphologies

    Authors: Xiangyu Kuang, Guoye Guan, Chao Tang, Lei Zhang

    Abstract: The phase field model can accurately simulate the evolution of microstructures with complex morphologies, and it has been widely used for cell modeling in the last two decades. However, compared to other cellular models such as the coarse-grained model and the vertex model, its high computational cost caused by three-dimensional spatial discretization hampered its application and scalability, espe…

    Submitted 10 June, 2022; originally announced June 2022.

    Comments: 29 pages, 6 figures, 3 tables, 9 supplemental figures, 4 supplemental tables, 7 supplemental movies

  37. arXiv:2205.13644  [pdf]

    q-bio.NC

    Representing Brain Anatomical Regularity and Variability by Few-Shot Embedding

    Authors: Lu Zhang, Xiaowei Yu, Yanjun Lyu, Zhengwang Wu, Haixing Dai, Lin Zhao, Li Wang, Gang Li, Tianming Liu, Dajiang Zhu

    Abstract: Effective representation of brain anatomical architecture is fundamental in understanding brain regularity and variability. Despite numerous efforts, it is still difficult to infer reliable anatomical correspondence at finer scale, given the tremendous individual variability in cortical folding patterns. It is even more challenging to disentangle common and individual patterns when comparing brain…

    Submitted 26 May, 2022; originally announced May 2022.

  38. arXiv:2204.09225  [pdf]

    q-bio.NC cs.AI

    Disentangling Spatial-Temporal Functional Brain Networks via Twin-Transformers

    Authors: Xiaowei Yu, Lu Zhang, Lin Zhao, Yanjun Lyu, Tianming Liu, Dajiang Zhu

    Abstract: How to identify and characterize functional brain networks (BN) is fundamental to gain system-level insights into the mechanisms of brain organizational architecture. Current functional magnetic resonance imaging (fMRI) analysis relies heavily on prior knowledge of specific patterns in either spatial (e.g., resting-state network) or temporal (e.g., task stimulus) domain. In addition, most approaches aim to…

    Submitted 26 June, 2022; v1 submitted 20 April, 2022; originally announced April 2022.

    Comments: full pages

  39. arXiv:2204.02513  [pdf]

    q-bio.BM cs.LG physics.bio-ph

    In-Pocket 3D Graphs Enhance Ligand-Target Compatibility in Generative Small-Molecule Creation

    Authors: Seung-gu Kang, Jeffrey K. Weber, Joseph A. Morrone, Leili Zhang, Tien Huynh, Wendy D. Cornell

    Abstract: Proteins in complex with small molecule ligands represent the core of structure-based drug discovery. However, three-dimensional representations are absent from most deep-learning-based generative models. We here present a graph-based generative modeling technology that encodes explicit 3D protein-ligand contacts within a relational graph architecture. The models combine a conditional variational…

    Submitted 5 April, 2022; originally announced April 2022.

    Comments: 5 pages, 3 figures

  40. arXiv:2201.10958  [pdf, other]

    q-bio.PE

    Two Results about the Sackin and Colless Indices for Phylogenetic Trees and Their Shapes

    Authors: Gary Goh, Michael Fuchs, Louxin Zhang

    Abstract: The Sackin and Colless indices are two widely-used metrics for measuring the balance of trees and for testing evolutionary models in phylogenetics. This short paper contributes two results about the Sackin and Colless indices of trees. One result is the asymptotic analysis of the expected Sackin and Colless indices of a tree shape (which are full binary rooted unlabelled trees) under the uniform m…

    Submitted 18 July, 2022; v1 submitted 26 January, 2022; originally announced January 2022.

    Comments: 10 pages, 1 figure

    MSC Class: 05A16; 05C30; 92D15

  41. arXiv:2201.09637  [pdf, other]

    cs.LG cs.AI q-bio.QM

    DrugOOD: Out-of-Distribution (OOD) Dataset Curator and Benchmark for AI-aided Drug Discovery -- A Focus on Affinity Prediction Problems with Noise Annotations

    Authors: Yuanfeng Ji, Lu Zhang, Jiaxiang Wu, Bingzhe Wu, Long-Kai Huang, Tingyang Xu, Yu Rong, Lanqing Li, Jie Ren, Ding Xue, Houtim Lai, Shaoyong Xu, Jing Feng, Wei Liu, Ping Luo, Shuigeng Zhou, Junzhou Huang, Peilin Zhao, Yatao Bian

    Abstract: AI-aided drug discovery (AIDD) is gaining increasing popularity due to its promise of making the search for new pharmaceuticals quicker, cheaper and more efficient. In spite of its extensive use in many fields, such as ADMET prediction, virtual screening, protein folding and generative chemistry, little has been explored in terms of the out-of-distribution (OOD) learning problem with noise,…

    Submitted 24 January, 2022; originally announced January 2022.

    Comments: 54 pages, 11 figures

  42. arXiv:2112.15379  [pdf, other]

    q-bio.PE

    The Sackin Index of Simplex Networks

    Authors: Louxin Zhang

    Abstract: A phylogenetic network is a simplex (or 1-component tree-child) network if the child of every reticulation node is a network leaf. Simplex networks are a superclass of phylogenetic trees and a subclass of tree-child networks. Generalizing the Sackin index to phylogenetic networks, we prove that the expected Sackin index of a random simplex network is asymptotically $Ω(n^{7/4})$ in the uniform mode…

    Submitted 31 December, 2021; originally announced December 2021.

    Comments: 19 pages, 2 figures

    MSC Class: 05A16; 05C30; 92D15

  43. arXiv:2111.05882  [pdf]

    q-bio.QM cs.CV eess.IV

    A Histopathology Study Comparing Contrastive Semi-Supervised and Fully Supervised Learning

    Authors: Lantian Zhang, Mohamed Amgad, Lee A. D. Cooper

    Abstract: Data labeling is often the most challenging task when developing computational pathology models. Pathologist participation is necessary to generate accurate labels, and the limitations on pathologist time and demand for large, labeled datasets has led to research in areas including weakly supervised learning using patient-level labels, machine assisted annotation and active learning. In this paper…

    Submitted 10 November, 2021; originally announced November 2021.

    Comments: 7 pages, 4 figures, 4 tables

  44. arXiv:2111.00287  [pdf, other]

    q-bio.MN q-bio.GN

    CdtGRN: Construction of qualitative time-delayed gene regulatory networks with a deep learning method

    Authors: Ruijie Xu, Lin Zhang, Yu Chen

    Abstract: Background: Gene regulations often change over time rather than being constant. But many of gene regulatory networks extracted from databases are static. The tumor suppressor gene $P53$ is involved in the pathogenesis of many tumors, and its inhibition effects occur after a certain period. Therefore, it is of great significance to elucidate the regulation mechanism over time points. Result: A qualit…

    Submitted 30 October, 2021; originally announced November 2021.

  45. arXiv:2110.07787  [pdf]

    q-bio.QM q-bio.GN

    Analysis and visualization of spatial transcriptomic data

    Authors: Boxiang Liu, Yanjun Li, Liang Zhang

    Abstract: Human and animal tissues consist of heterogeneous cell types that organize and interact in highly structured manners. Bulk and single-cell sequencing technologies remove cells from their original microenvironments, resulting in a loss of spatial information. Spatial transcriptomics is a recent technological innovation that measures transcriptomic information while preserving spatial information. S…

    Submitted 5 February, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

    Comments: 22 pages, 4 figures, 2 tables

    Journal ref: Frontiers in Genetics 12, 1-22 (2022)

  46. arXiv:2109.00809  [pdf]

    q-bio.GN

    A review of computational tools for generating metagenome-assembled genomes from metagenomic sequencing data

    Authors: Chao Yang, Debajyoti Chowdhury, Zhenmiao Zhang, William K. Cheung, Aiping Lu, Zhao Xiang Bian, Lu Zhang

    Abstract: Microbes are essentially yet convolutedly linked with human lives on the earth. They critically interfere in different physiological processes and thus influence overall health status. Studying microbial species is used to be constrained to those that can be cultured in the lab. But it excluded a huge portion of the microbiome that could not survive on lab conditions. In the past few years, the cu…

    Submitted 2 September, 2021; originally announced September 2021.

  47. arXiv:2106.07622  [pdf]

    q-bio.NC

    Representative Functional Connectivity Learning for Multiple Clinical groups in Alzheimer's Disease

    Authors: Lu Zhang, Xiaowei Yu, Yanjun Lyu, Li Wang, Dajiang Zhu

    Abstract: Mild cognitive impairment (MCI) is a high-risk dementia condition which progresses to probable Alzheimer's disease (AD) at approximately 10% to 15% per year. Characterization of group-level differences between two subtypes of MCI - stable MCI (sMCI) and progressive MCI (pMCI) is the key step to understand the mechanisms of MCI progression and enable possible delay of transition from MCI to AD. Fun…

    Submitted 14 June, 2021; originally announced June 2021.

  48. arXiv:2103.10012  [pdf, ps, other]

    q-bio.PE cs.SI physics.soc-ph

    Age-Stratified COVID-19 Spread Analysis and Vaccination: A Multitype Random Network Approach

    Authors: Xianhao Chen, Guangyu Zhu, Lan Zhang, Yuguang Fang, Linke Guo, Xinguang Chen

    Abstract: The risk for severe illness and mortality from COVID-19 significantly increases with age. As a result, age-stratified modeling for COVID-19 dynamics is the key to study how to reduce hospitalizations and mortality from COVID-19. By taking advantage of network theory, we develop an age-stratified epidemic model for COVID-19 in complex contact networks. Specifically, we present an extension of stand…

    Submitted 18 March, 2021; originally announced March 2021.

    Comments: 11 pages, 9 figures

  49. arXiv:2102.06847  [pdf]

    q-bio.NC cs.LG

    Disease2Vec: Representing Alzheimer's Progression via Disease Embedding Tree

    Authors: Lu Zhang, Li Wang, Tianming Liu, Dajiang Zhu

    Abstract: For decades, a variety of predictive approaches have been proposed and evaluated in terms of their prediction capability for Alzheimer's Disease (AD) and its precursor - mild cognitive impairment (MCI). Most of them focused on prediction or identification of statistical differences among different clinical groups or phases (e.g., longitudinal studies). The continuous nature of AD development and t…

    Submitted 17 December, 2022; v1 submitted 12 February, 2021; originally announced February 2021.

    Comments: Submitted to Information Processing in Medical Imaging (IPMI) 2023

  50. arXiv:2012.00672  [pdf]

    q-bio.QM

    Dynamics-based peptide-MHC binding optimization by a convolutional variational autoencoder: a use-case model for CASTELO

    Authors: David Bell, Giacomo Domeniconi, Chih-Chieh Yang, Ruhong Zhou, Leili Zhang, Guojing Cong

    Abstract: An unsolved challenge in the development of antigen specific immunotherapies is determining the optimal antigens to target. Comprehension of antigen-MHC binding is paramount towards achieving this goal. Here, we present CASTELO, a combined machine learning-molecular dynamics (ML-MD) approach to design novel antigens of increased MHC binding affinity for a Type 1 diabetes (T1D)-implicated system. W…

    Submitted 8 December, 2020; v1 submitted 29 November, 2020; originally announced December 2020.