A Unified Intracellular pH Landscape with SITE-pHorin: a Quantum-Entanglement-Enhanced pH Probe

Authors: Shu-Ang Li, Xiao-Yan Meng, Su Zhang, Ying-Jie Zhang, Run-Zhou Yang, Dian-Dian Wang, Yang Yang, Pei-Pei Liu, Jian-Sheng Kang

Abstract: An accurate map of intracellular organelle pH is crucial for comprehending cellular metabolism and organellar functions. However, a unified intracellular pH spectrum measured with a single probe is still lacking. Here, we developed a novel quantum entanglement-enhanced pH-sensitive probe called SITE-pHorin, which features a wide pH-sensitive range and ratiometric quantitative measurement capabilities. Subsequently, we measured the pH of various organelles and their sub-compartments, including mitochondrial sub-spaces, Golgi stacks, endoplasmic reticulum, lysosomes, peroxisomes, and endosomes in COS-7 cells. Addressing the long-standing debate on the pH of mitochondrial compartments, we measured the pH of mitochondrial cristae as 6.60 ± 0.40, the pH of the mitochondrial intermembrane space as 6.95 ± 0.30, and two populations of mitochondrial matrix pH at approximately 7.20 ± 0.27 and 7.50 ± 0.16, respectively. Notably, the lysosome pH exhibited a single, narrow Gaussian distribution centered at 4.79 ± 0.17. Furthermore, quantum chemistry computations revealed that the deprotonation of the residue Y182 and the discrete curvature of the deformed benzene ring in the chromophore are both necessary for the quantum entanglement mechanism of SITE-pHorin. Intriguingly, our findings reveal an accurate pH gradient (0.6-0.9 pH unit) between mitochondrial cristae and matrix, suggesting that prior estimates of ΔpH (0.4-0.6) and of the mitochondrial proton motive force (pmf) are underestimates.

Submitted 4 July, 2024; originally announced July 2024.

Comments: 64 pages, 7 figures, the supplemental material contains 13 supplemental figures and 4 supplemental tables
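
The cristae-to-matrix gradient quoted in the abstract can be checked from the reported mean pH values. The following is a minimal sketch of that arithmetic only; the variable names are illustrative, and the ± spreads are ignored.

```python
# Mean pH values as reported in the abstract (the ± spreads are ignored here).
cristae_ph = 6.60
matrix_ph_populations = [7.20, 7.50]

# pH difference between each matrix population and the cristae.
gradients = [round(m - cristae_ph, 2) for m in matrix_ph_populations]
print(gradients)
```

The two differences bracket the 0.6-0.9 pH unit range stated in the abstract.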

arXiv:2406.17086 [pdf, other]

BrainMAE: A Region-aware Self-supervised Learning Framework for Brain Signals

Authors: Yifan Yang, Yutong Mao, Xufu Liu, Xiao Liu

Abstract: The human brain is a complex, dynamic network, which is commonly studied using functional magnetic resonance imaging (fMRI) and modeled as a network of regions of interest (ROIs) for understanding various brain functions. Recent studies utilize deep learning approaches to learn the brain network representation based on the functional connectivity (FC) profile, broadly falling into two main categories. The Fixed-FC approaches, utilizing the FC profile which represents the linear temporal relation within the brain network, are limited by failing to capture informative brain temporal dynamics. On the other hand, the Dynamic-FC approaches, modeling the evolving FC profile over time, often exhibit less satisfactory performance due to challenges in handling the inherently noisy nature of fMRI data. To address these challenges, we propose Brain Masked Auto-Encoder (BrainMAE) for learning representations directly from fMRI time-series data. Our approach incorporates two essential components: a region-aware graph attention mechanism designed to capture the relationships between different brain ROIs, and a novel self-supervised masked autoencoding framework for effective model pre-training. These components enable the model to capture rich temporal dynamics of brain activity while maintaining resilience to inherent noise in fMRI data. Our experiments demonstrate that BrainMAE consistently outperforms established baseline methods by significant margins in four distinct downstream tasks. Finally, leveraging the model's inherent interpretability, our analysis of model-generated representations reveals findings that resonate with ongoing research in the field of neuroscience.

Submitted 24 June, 2024; originally announced June 2024.

Comments: 27 pages, 16 figures

MSC Class: 92-08 (Primary) 68T07; 68T05 (Secondary)

ACM Class: J.3; I.5.4

arXiv:2406.10395 [pdf, other]

BrainFounder: Towards Brain Foundation Models for Neuroimage Analysis

Authors: Joseph Cox, Peng Liu, Skylar E. Stolte, Yunchao Yang, Kang Liu, Kyle B. See, Huiwen Ju, Ruogu Fang

Abstract: The burgeoning field of brain health research increasingly leverages artificial intelligence (AI) to interpret and analyze neurological data. This study introduces a novel approach towards the creation of medical foundation models by integrating a large-scale multi-modal magnetic resonance imaging (MRI) dataset derived from 41,400 participants in its own. Our method involves a novel two-stage pretraining approach using vision transformers. The first stage is dedicated to encoding anatomical structures in generally healthy brains, identifying key features such as shapes and sizes of different brain regions. The second stage concentrates on spatial information, encompassing aspects like location and the relative positioning of brain structures. We rigorously evaluate our model, BrainFounder, using the Brain Tumor Segmentation (BraTS) challenge and Anatomical Tracings of Lesions After Stroke v2.0 (ATLAS v2.0) datasets. BrainFounder demonstrates a significant performance gain, surpassing the achievements of the previous winning solutions using fully supervised learning. Our findings underscore the impact of scaling up both the complexity of the model and the volume of unlabeled training data derived from generally healthy brains, which enhances the accuracy and predictive capabilities of the model in complex neuroimaging tasks with MRI. The implications of this research provide transformative insights and practical applications in healthcare and take substantial steps towards the creation of foundation models for Medical AI. Our pretrained models and training code can be found at https://github.com/lab-smile/GatorBrain.

Submitted 14 June, 2024; originally announced June 2024.

Comments: 17 pages, 5 figures, to be published in Medical Image Analysis

arXiv:2405.15206 [pdf, other]

Maximum Caliber Infers Effective Coupling and Response from Spiking Networks

Authors: Kevin S. Chen, Ying-Jen Yang

Abstract: The characterization of network and biophysical properties from neural spiking activity is an important goal in neuroscience. A framework that provides unbiased inference on causal synaptic interaction and single-neuron properties has been missing. Here we applied the stochastic dynamics extension of Maximum Entropy -- the Maximum Caliber Principle -- to infer the transition rates of network states. Effective synaptic coupling strength and neuronal response functions for various network motifs can then be computed. The inferred minimal model also enables leading-order reconstruction of the inter-spike interval distribution. Our method is tested with numerically simulated spiking networks and applied to data from salamander retina.

Submitted 24 May, 2024; originally announced May 2024.

arXiv:2405.12144 [pdf]

Alterations of electrocortical activity during hand movements induced by motor cortex glioma

Authors: Yihan Wu, Tao Chang, Siliang Chen, Xiaodong Niu, Yu Li, Yuan Fang, Lei Yang, Yixuan Zong, Yaoxin Yang, Yuehua Li, Mengsong Wang, Wen Yang, Yixuan Wu, Chen Fu, Xia Fang, Yuxin Quan, Xilin Peng, Qiang Sun, Marc M. Van Hulle, Yanhui Liu, Ning Jiang, Dario Farina, Yuan Yang, Jiayuan He, Qing Mao

Abstract: Glioma cells can reshape functional neuronal networks by hijacking neuronal synapses, leading to partial or complete neurological dysfunction. These mechanisms have been previously explored for language functions. However, the impact of glioma on sensorimotor functions is still unknown. Therefore, we recruited a control group of patients with unaffected motor cortex and a group of patients with glioma-infiltrated motor cortex, and recorded high-density electrocortical signals during finger movement tasks. The results showed that glioma suppresses task-related synchronization in the high-gamma band and reduces the power across all frequency bands. The resulting atypical motor information transmission model with discrete signaling pathways and delayed responses disrupts the stability of neuronal encoding patterns for finger movement kinematics across various temporal-spatial scales. These findings demonstrate that gliomas functionally invade neural circuits within the motor cortex. This result advances our understanding of motor function processing in chronic disease states, which is important for advancing surgical strategies and neurorehabilitation approaches for patients with malignant gliomas.

Submitted 20 May, 2024; originally announced May 2024.

arXiv:2404.00254 [pdf, other]

Clustering for Protein Representation Learning

Authors: Ruijie Quan, Wenguan Wang, Fan Ma, Hehe Fan, Yi Yang

Abstract: Protein representation learning is a challenging task that aims to capture the structure and function of proteins from their amino acid sequences. Previous methods largely ignored the fact that not all amino acids are equally important for protein folding and activity. In this article, we propose a neural clustering framework that can automatically discover the critical components of a protein by considering both its primary and tertiary structure information. Our framework treats a protein as a graph, where each node represents an amino acid and each edge represents a spatial or sequential connection between amino acids. We then apply an iterative clustering strategy to group the nodes into clusters based on their 1D and 3D positions and assign scores to each cluster. We select the highest-scoring clusters and use their medoid nodes for the next iteration of clustering, until we obtain a hierarchical and informative representation of the protein. We evaluate on four protein-related tasks: protein fold classification, enzyme reaction classification, gene ontology term prediction, and enzyme commission number prediction. Experimental results demonstrate that our method achieves state-of-the-art performance.

Submitted 30 March, 2024; originally announced April 2024.

Comments: Accepted to CVPR2024

arXiv:2403.17135 [pdf, ps, other]

Exploring the Generalization of Cancer Clinical Trial Eligibility Classifiers Across Diseases

Authors: Yumeng Yang, Ashley Gilliam, Ethan B Ludmir, Kirk Roberts

Abstract: Clinical trials are pivotal in medical research, and NLP can enhance their success, with application in recruitment. This study aims to evaluate the generalizability of eligibility classification across a broad spectrum of clinical trials. We start with phase 3 cancer trials, annotated with seven eligibility exclusions, and then determine how well models can generalize to non-cancer and non-phase 3 trials. To assess this, we have compiled eligibility criteria data for five types of trials: (1) additional phase 3 cancer trials, (2) phase 1 and 2 cancer trials, (3) heart disease trials, (4) type 2 diabetes trials, and (5) observational trials for any disease, comprising 2,490 annotated eligibility criteria across seven exclusion types. Our results show that models trained on the extensive cancer dataset can effectively handle criteria commonly found in non-cancer trials, such as autoimmune diseases. However, they struggle with criteria disproportionately prevalent in cancer trials, like prior malignancy. We also experiment with few-shot learning, demonstrating that a limited number of disease-specific examples can partially overcome this performance gap. We are releasing this new dataset of annotated eligibility statements to promote the development of cross-disease generalization in clinical trial classification.

Submitted 25 March, 2024; originally announced March 2024.

arXiv:2403.14358 [pdf, other]

Exploring the Potential of Large Language Models in Graph Generation

Authors: Yang Yao, Xin Wang, Zeyang Zhang, Yijian Qin, Ziwei Zhang, Xu Chu, Yuekui Yang, Wenwu Zhu, Hong Mei

Abstract: Large language models (LLMs) have achieved great success in many fields, and recent works have explored LLMs for graph discriminative tasks such as node classification. However, the abilities of LLMs for graph generation remain unexplored in the literature. Graph generation requires the LLM to generate graphs with given properties, which has valuable real-world applications such as drug discovery but tends to be more challenging. In this paper, we propose LLM4GraphGen to explore the ability of LLMs for graph generation with systematic task designs and extensive experiments. Specifically, we propose several tasks tailored with comprehensive experiments to address key questions regarding LLMs' understanding of different graph structure rules, their ability to capture structural type distributions, and their utilization of domain knowledge for property-based graph generation. Our evaluations demonstrate that LLMs, particularly GPT-4, exhibit preliminary abilities in graph generation tasks, including rule-based and distribution-based generation. We also observe that popular prompting methods, such as few-shot and chain-of-thought prompting, do not consistently enhance performance. Besides, LLMs show potential in generating molecules with specific properties. These findings may serve as foundations for designing good LLM-based models for graph generation and provide valuable insights for further research.

Submitted 21 March, 2024; originally announced March 2024.

arXiv:2403.13829 [pdf, other]

DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization

Authors: Xiangxin Zhou, Xiwei Cheng, Yuwei Yang, Yu Bao, Liang Wang, Quanquan Gu

Abstract: Recently, 3D generative models have shown promising performance in structure-based drug design by learning to generate ligands given target binding sites. However, only modeling the target-ligand distribution can hardly fulfill one of the main goals in drug discovery -- designing novel ligands with desired properties, e.g., high binding affinity, easy synthesizability, etc. This challenge becomes particularly pronounced when the target-ligand pairs used for training do not align with these desired properties. Moreover, most existing methods aim at solving the de novo design task, while many generative scenarios requiring flexible controllability, such as R-group optimization and scaffold hopping, have received little attention. In this work, we propose DecompOpt, a structure-based molecular optimization method based on a controllable and decomposed diffusion model. DecompOpt presents a new generation paradigm which combines optimization with conditional diffusion models to achieve desired properties while adhering to the molecular grammar. Additionally, DecompOpt offers a unified framework covering both de novo design and controllable generation. To achieve this, ligands are decomposed into substructures, which allows fine-grained control and local optimization. Experiments show that DecompOpt can efficiently generate molecules with better properties than strong de novo baselines, and demonstrates great potential in controllable generation tasks.

Submitted 6 March, 2024; originally announced March 2024.

Comments: Accepted to ICLR 2024

arXiv:2403.11375 [pdf, other]

Path-GPTOmic: A Balanced Multi-modal Learning Framework for Survival Outcome Prediction

Authors: Hongxiao Wang, Yang Yang, Zhuo Zhao, Pengfei Gu, Nishchal Sapkota, Danny Z. Chen

Abstract: For predicting cancer survival outcomes, standard approaches in clinical research are often based on two main modalities: pathology images for observing cell morphology features, and genomic data (e.g., bulk RNA-seq) for quantifying gene expression. However, existing pathology-genomic multi-modal algorithms face significant challenges: (1) valuable biological insights regarding genes and gene-gene interactions are frequently overlooked; (2) one modality often dominates the optimization process, causing inadequate training for the other modality. In this paper, we introduce a new multi-modal "Path-GPTOmic" framework for cancer survival outcome prediction. First, to extract valuable biological insights, we regulate the embedding space of a foundation model, scGPT, initially trained on single-cell RNA-seq data, making it adaptable for bulk RNA-seq data. Second, to address the imbalance-between-modalities problem, we propose a gradient modulation mechanism tailored to the Cox partial likelihood loss for survival prediction. The contributions of the modalities are dynamically monitored and adjusted during the training process, encouraging both modalities to be sufficiently trained. Evaluated on two TCGA (The Cancer Genome Atlas) datasets, our model achieves substantially improved survival prediction accuracy.

Submitted 17 March, 2024; originally announced March 2024.

Comments: Accepted by IEEE International Symposium on Biomedical Imaging (ISBI 2024)
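
The abstract refers to the Cox partial likelihood loss without stating it. As a rough illustration only, the sketch below computes the standard negative Cox partial log-likelihood in plain Python. It is not the authors' implementation: the function name, the averaging over events, and the toy data are assumptions, tied event times are not treated specially, and the gradient modulation mechanism is not shown.

```python
import math

def cox_neg_partial_log_likelihood(risk_scores, times, events):
    """Negative Cox partial log-likelihood, averaged over observed events.

    risk_scores: model outputs (log-hazard ratios), one per patient.
    times: follow-up times; events: 1 if the event was observed, 0 if censored.
    Hypothetical helper for illustration; ties are not treated specially.
    """
    loss = 0.0
    n_events = 0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored patients only appear in risk sets
        # Risk set: everyone still under observation at time t_i.
        denom = sum(math.exp(r) for r, t in zip(risk_scores, times) if t >= t_i)
        loss -= risk_scores[i] - math.log(denom)
        n_events += 1
    return loss / max(n_events, 1)

# Toy data (made up): earlier events should receive higher risk scores.
times = [1.0, 2.0, 3.0]
events = [1, 1, 1]
concordant = cox_neg_partial_log_likelihood([2.0, 1.0, 0.0], times, events)
discordant = cox_neg_partial_log_likelihood([0.0, 1.0, 2.0], times, events)
```

Scores that rank patients in the order of their event times yield a lower loss than scores that rank them in reverse, which is the property a survival model trained with this loss exploits.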

arXiv:2403.08203 [pdf, other]

Learnable Community-Aware Transformer for Brain Connectome Analysis with Token Clustering

Authors: Yanting Yang, Beidi Zhao, Zhuohao Ni, Yize Zhao, Xiaoxiao Li

Abstract: Neuroscientific research has revealed that the complex brain network can be organized into distinct functional communities, each characterized by a cohesive group of regions of interest (ROIs) with strong interconnections. These communities play a crucial role in comprehending the functional organization of the brain and its implications for neurological conditions, including Autism Spectrum Disorder (ASD) and biological differences, such as in gender. Traditional models have been constrained by the necessity of predefined community clusters, limiting their flexibility and adaptability in deciphering the brain's functional organization. Furthermore, these models were restricted by a fixed number of communities, hindering their ability to accurately represent the brain's dynamic nature. In this study, we present a token clustering brain transformer-based model (TC-BrainTF) for joint community clustering and classification. Our approach proposes a novel token clustering (TC) module based on the transformer architecture, which utilizes learnable prompt tokens with orthogonal loss where each ROI embedding is projected onto the prompt embedding space, effectively clustering ROIs into communities and reducing the dimensions of the node representation via merging with communities. Our results demonstrate that our learnable community-aware model TC-BrainTF offers improved accuracy in identifying ASD and classifying genders through rigorous testing on ABIDE and HCP datasets. Additionally, the qualitative analysis on TC-BrainTF has demonstrated the effectiveness of the designed TC module and its relevance to neuroscience interpretations.

Submitted 12 March, 2024; originally announced March 2024.

arXiv:2403.07902 [pdf, other]

DecompDiff: Diffusion Models with Decomposed Priors for Structure-Based Drug Design

Authors: Jiaqi Guan, Xiangxin Zhou, Yuwei Yang, Yu Bao, Jian Peng, Jianzhu Ma, Qiang Liu, Liang Wang, Quanquan Gu

Abstract: Designing 3D ligands within a target binding site is a fundamental task in drug discovery. Existing structure-based drug design methods treat all ligand atoms equally, which ignores the different roles of atoms in the ligand for drug design and can be less efficient for exploring the large drug-like molecule space. In this paper, inspired by the convention in pharmaceutical practice, we decompose the ligand molecule into two parts, namely arms and scaffold, and propose a new diffusion model, DecompDiff, with decomposed priors over arms and scaffold. In order to facilitate the decomposed generation and improve the properties of the generated molecules, we incorporate both bond diffusion in the model and additional validity guidance in the sampling phase. Extensive experiments on CrossDocked2020 show that our approach achieves state-of-the-art performance in generating high-affinity molecules while maintaining proper molecular properties and conformational stability, with up to -8.39 Avg. Vina Dock score and 24.5 Success Rate. The code is provided at https://github.com/bytedance/DecompDiff

Submitted 26 February, 2024; originally announced March 2024.

Comments: Accepted to ICML 2023

arXiv:2403.05762 [pdf, other]

Lateral Control of Brain-Controlled Vehicle Based on SVM Probability Output Model

Authors: Hongguang Pan, Xinyu Yu, Yong Yang

Abstract: The non-stationary characteristics of EEG signals and the individual differences of brain-computer interfaces (BCIs) lead to poor performance in the control process of brain-controlled vehicles (BCVs). In this paper, by combining a steady-state visual evoked potential (SSVEP) interactive interface, a brain instructions generation module and a vehicle lateral control module, a probabilistic output model based on support vector machine (SVM) is proposed for BCV lateral control to improve the driving performance. Firstly, a filter bank common spatial pattern (FBCSP) algorithm is introduced into the brain instructions generation module, which can improve the off-line decoding performance. Secondly, a sigmoid-fitting SVM (SF-SVM) is trained based on the sigmoid-fitting method and the lateral control module is developed, which outputs a probability for every command instead of a single specific command. Finally, a pre-experiment and two road-keeping experiments are conducted. In the pre-experiment, the results show that the average highest off-line accuracy among subjects is 95.64%, while in the online stage the average accuracy is only 84.44%. In the road-keeping experiments, the task completion rate in the two designed scenes increased by 25.6% and 20%, respectively.

Submitted 8 March, 2024; originally announced March 2024.
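
The idea of a probabilistic SVM output can be sketched as follows, assuming the sigmoid-fitting method is a Platt-style mapping P = 1 / (1 + exp(A·f + B)) from a decision value f to a probability. The parameter values, command names, and the normalization across commands are hypothetical and are not taken from the paper.

```python
import math

def sigmoid_probability(decision_value, a=-2.0, b=0.0):
    """Map an SVM decision value to a probability with a fitted sigmoid.

    a and b are hypothetical placeholders; in practice they would be fitted
    on held-out data.
    """
    return 1.0 / (1.0 + math.exp(a * decision_value + b))

def command_probabilities(decision_values):
    """Turn per-command decision values into probabilities that sum to 1.

    Normalizing the per-command sigmoid outputs is an assumption made here
    for illustration.
    """
    raw = {cmd: sigmoid_probability(v) for cmd, v in decision_values.items()}
    total = sum(raw.values())
    return {cmd: p / total for cmd, p in raw.items()}

# Made-up decision values for three lateral-control commands.
scores = {"left": 1.2, "straight": 0.1, "right": -0.8}
probs = command_probabilities(scores)
print(probs)
```

A controller can then weight its steering action by all three probabilities rather than committing to the single highest-scoring command.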

arXiv:2403.01433 [pdf, other]

BrainMass: Advancing Brain Network Analysis for Diagnosis with Large-scale Self-Supervised Learning

Authors: Yanwu Yang, Chenfei Ye, Guinan Su, Ziyao Zhang, Zhikai Chang, Hairui Chen, Piu Chan, Yue Yu, Ting Ma

Abstract: Foundation models pretrained on large-scale datasets via self-supervised learning demonstrate exceptional versatility across various tasks. Because medical data are heterogeneous and hard to collect, this approach is especially beneficial for medical image analysis and neuroscience research, as it streamlines broad downstream tasks without the need for numerous costly annotations. However, there has been limited investigation into brain network foundation models, limiting their adaptability and generalizability for broad neuroscience studies. In this study, we aim to bridge this gap. In particular, (1) we curated a comprehensive dataset by collating images from 30 datasets, which comprises 70,781 samples of 46,686 participants. Moreover, we introduce pseudo-functional connectivity (pFC) to further generate millions of augmented brain networks by randomly dropping certain timepoints of the BOLD signal. (2) We propose the BrainMass framework for brain network self-supervised learning via mask modeling and feature alignment. BrainMass employs Mask-ROI Modeling (MRM) to bolster intra-network dependencies and regional specificity. Furthermore, a Latent Representation Alignment (LRA) module is utilized to regularize augmented brain networks of the same participant with similar topological properties to yield similar latent representations by aligning their latent embeddings. Extensive experiments on eight internal tasks and seven external brain disorder diagnosis tasks show BrainMass's superior performance, highlighting its significant generalizability and adaptability. Moreover, BrainMass demonstrates powerful few/zero-shot learning abilities and exhibits meaningful interpretations of various diseases, showcasing its potential use for clinical applications.

Submitted 3 March, 2024; originally announced March 2024.
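
The pseudo-functional connectivity (pFC) augmentation described in the abstract can be sketched as: drop a random subset of timepoints from the BOLD series, then recompute the ROI-by-ROI correlation matrix. The sketch below assumes Pearson correlation and a hypothetical `drop_fraction` parameter; it is not the authors' code, and the toy data are random numbers.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def pseudo_fc(bold, drop_fraction=0.1, rng=None):
    """Build one augmented connectivity matrix from a BOLD array.

    bold: list of ROI time series, each of the same length.
    drop_fraction: share of timepoints to discard (hypothetical parameter).
    """
    if rng is None:
        rng = random.Random()
    n_time = len(bold[0])
    n_keep = max(2, int(round(n_time * (1 - drop_fraction))))
    # Keep a random subset of timepoints, preserving temporal order.
    keep = sorted(rng.sample(range(n_time), n_keep))
    kept = [[roi[t] for t in keep] for roi in bold]
    return [[pearson(a, b) for b in kept] for a in kept]

# Toy example: 3 ROIs, 50 timepoints of random noise.
rng = random.Random(0)
bold = [[rng.gauss(0, 1) for _ in range(50)] for _ in range(3)]
fc = pseudo_fc(bold, drop_fraction=0.2, rng=rng)
```

Calling `pseudo_fc` repeatedly on the same recording yields different but related matrices, which is how a single scan can supply many augmented brain networks.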

arXiv:2402.18583 [pdf, other]

Binding-Adaptive Diffusion Models for Structure-Based Drug Design

Authors: Zhilin Huang, Ling Yang, Zaixi Zhang, Xiangxin Zhou, Yu Bao, Xiawu Zheng, Yuwei Yang, Yu Wang, Wenming Yang

Abstract: Structure-based drug design (SBDD) aims to generate 3D ligand molecules that bind to specific protein targets. Existing 3D deep generative models including diffusion models have shown great promise for SBDD. However, it is complex to capture the essential protein-ligand interactions exactly in 3D space for molecular generation. To address this problem, we propose a novel framework, namely Binding-Adaptive Diffusion Models (BindDM). In BindDM, we adaptively extract the subcomplex, the essential part of binding sites responsible for protein-ligand interactions. Then the selected protein-ligand subcomplex is processed with SE(3)-equivariant neural networks, and transmitted back to each atom of the complex for augmenting the target-aware 3D molecule diffusion generation with binding interaction information. We iterate this hierarchical complex-subcomplex process with a cross-hierarchy interaction node for adequately fusing global binding context between the complex and its corresponding subcomplex. Empirical studies on the CrossDocked2020 dataset show BindDM can generate molecules with more realistic 3D structures and higher binding affinities towards the protein targets, with up to -5.92 Avg. Vina Score, while maintaining proper molecular properties. Our code is available at https://github.com/YangLing0818/BindDM

Submitted 14 January, 2024; originally announced February 2024.

Comments: Accepted by AAAI 2024. Project: https://github.com/YangLing0818/BindDM

arXiv:2402.14315 [pdf, other]

Structure-Based Drug Design via 3D Molecular Generative Pre-training and Sampling

Authors: Yuwei Yang, Siqi Ouyang, Xueyu Hu, Mingyue Zheng, Hao Zhou, Lei Li

Abstract: Structure-based drug design aims at generating high affinity ligands with prior knowledge of 3D target structures. Existing methods either use a conditional generative model to learn the distribution of 3D ligands given target binding sites, or iteratively modify molecules to optimize a structure-based activity estimator. The former is highly constrained by data quantity and quality, which leaves optimization-based approaches more promising in practical scenarios. However, existing optimization-based approaches choose to edit molecules in 2D space, and use molecular docking to estimate the activity using docking predicted 3D target-ligand complexes. The misalignment between the action space and the objective hinders the performance of these models, especially for those that employ deep learning for acceleration. In this work, we propose MolEdit3D to combine 3D molecular generation with optimization frameworks. We develop a novel 3D graph editing model to generate molecules using fragments, and pre-train this model on abundant 3D ligands for learning target-independent properties. Then we employ a target-guided self-learning strategy to improve target-related properties using self-sampled molecules. MolEdit3D achieves state-of-the-art performance on the majority of the evaluation metrics, and demonstrates strong capability of capturing both target-dependent and -independent properties.

Submitted 15 March, 2024; v1 submitted 22 February, 2024; originally announced February 2024.

arXiv:2402.10251 [pdf, other]

Brant-2: Foundation Model for Brain Signals

Authors: Zhizhang Yuan, Daoze Zhang, Junru Chen, Gefei Gu, Yang Yang

Abstract: Foundational models benefit from pre-training on large amounts of unlabeled data and enable strong performance in a wide variety of applications with a small amount of labeled data. Such models can be particularly effective in analyzing brain signals, as this field encompasses numerous application scenarios, and it is costly to perform large-scale annotation. In this work, we present the largest foundation model in brain signals, Brant-2. Compared to Brant, a foundation model designed for intracranial neural signals, Brant-2 not only exhibits robustness towards data variations and modeling scales but also can be applied to a broader range of brain neural data. By experimenting on an extensive range of tasks, we demonstrate that Brant-2 is adaptive to various application scenarios in brain signals. Further analyses reveal the scalability of Brant-2, validate each component's effectiveness, and showcase our model's ability to maintain performance in scenarios with scarce labels.

Submitted 28 March, 2024; v1 submitted 15 February, 2024; originally announced February 2024.

Comments: 14 pages, 7 figures

arXiv:2402.09649 [pdf, other]

ProtChatGPT: Towards Understanding Proteins with Large Language Models

Authors: Chao Wang, Hehe Fan, Ruijie Quan, Yi Yang

Abstract: Protein research is crucial in various fundamental disciplines, but understanding their intricate structure-function relationships remains challenging. Recent Large Language Models (LLMs) have made significant strides in comprehending task-specific knowledge, suggesting the potential for ChatGPT-like systems specialized in protein to facilitate basic research. In this work, we introduce ProtChatGPT, which aims at learning and understanding protein structures via natural languages. ProtChatGPT enables users to upload proteins, ask questions, and engage in interactive conversations to produce comprehensive answers. The system comprises protein encoders, a Protein-Language Pertaining Transformer (PLP-former), a projection adapter, and an LLM. The protein first undergoes protein encoders and PLP-former to produce protein embeddings, which are then projected by the adapter to conform with the LLM. The LLM finally combines user questions with projected embeddings to generate informative answers. Experiments show that ProtChatGPT can produce promising responses to proteins and their corresponding questions. We hope that ProtChatGPT could form the basis for further exploration and application in protein research. Code and our pre-trained model will be publicly available.

Submitted 14 February, 2024; originally announced February 2024.

arXiv:2312.10892 [pdf, other]

Deep Learning-based MRI Reconstruction with Artificial Fourier Transform (AFT)-Net

Authors: Yanting Yang, Jeffery Siyuan Tian, Matthieu Dagommer, Jia Guo

Abstract: The deep complex-valued neural network provides a powerful way to leverage complex number operations and representations, which has succeeded in several phase-based applications. However, most previously published networks have not fully assessed the impact of complex-valued networks in the frequency domain. Here, we introduced a unified complex-valued deep learning framework - artificial Fourier transform network (AFT-Net) - which combined domain-manifold learning and complex-valued neural networks. The AFT-Net can be readily used to solve the image inverse problems in domain-transform, especially for accelerated magnetic resonance imaging (MRI) reconstruction and other applications. While conventional methods only accept magnitude images, the proposed method takes raw k-space data in the frequency domain as inputs, allowing a mapping between the k-space domain and the image domain to be determined through cross-domain learning. We show that AFT-Net achieves superior accelerated MRI reconstruction and is comparable to existing approaches. Also, our approach can be applied to different tasks like denoised MRS reconstruction and different datasets with various contrasts. The AFT-Net presented here is a valuable preprocessing component for different preclinical studies and provides an innovative alternative for solving inverse problems in imaging and spectroscopy.

Submitted 17 December, 2023; originally announced December 2023.

arXiv:2311.18377 [pdf]

Transfer Learning across Different Chemical Domains: Virtual Screening of Organic Materials with Deep Learning Models Pretrained on Small Molecule and Chemical Reaction Data

Authors: Chengwei Zhang, Yushuang Zhai, Ziyang Gong, Hongliang Duan, Yuan-Bin She, Yun-Fang Yang, An Su

Abstract: Machine learning is becoming a preferred method for the virtual screening of organic materials due to its cost-effectiveness over traditional computationally demanding techniques. However, the scarcity of labeled data for organic materials poses a significant challenge for training advanced machine learning models. This study showcases the potential of utilizing databases of drug-like small molecules and chemical reactions to pretrain the BERT model, enhancing its performance in the virtual screening of organic materials. By fine-tuning the BERT models with data from five virtual screening tasks, the version pretrained with the USPTO-SMILES dataset achieved R² scores exceeding 0.94 for three tasks and over 0.81 for two others. This performance surpasses that of models pretrained on the small molecule or organic materials databases and outperforms three traditional machine learning models trained directly on virtual screening data. The success of the USPTO-SMILES pretrained BERT model can be attributed to the diverse array of organic building blocks in the USPTO database, offering a broader exploration of the chemical space. The study further suggests that accessing a reaction database with a wider range of reactions than the USPTO could further enhance model performance. Overall, this research validates the feasibility of applying transfer learning across different chemical domains for the efficient virtual screening of organic materials.

Submitted 5 March, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

arXiv:2310.17152 [pdf]

Technical Note: Feasibility of translating 3.0T-trained Deep-Learning Segmentation Models Out-of-the-Box on Low-Field MRI 0.55T Knee-MRI of Healthy Controls

Authors: Rupsa Bhattacharjee, Zehra Akkaya, Johanna Luitjens, Pan Su, Yang Yang, Valentina Pedoia, Sharmila Majumdar

Abstract: In the current study, our purpose is to evaluate the feasibility of applying deep learning (DL) enabled algorithms to quantify bilateral knee biomarkers in healthy controls scanned at 0.55T and compared with 3.0T. The current study assesses the performance of standard in-practice bone and cartilage segmentation algorithms at 0.55T, both qualitatively and quantitatively, in terms of comparing segmentation performance, areas of improvement, and compartment-wise cartilage thickness values between 0.55T vs. 3.0T. Initial results demonstrate a usable to good technical feasibility of translating existing quantitative deep-learning-based image segmentation techniques, trained on 3.0T, out of 0.55T for knee MRI, in a multi-vendor acquisition environment. Especially in terms of segmenting cartilage compartments, the models perform almost equivalently to 3.0T in terms of Likert ranking. The 0.55T low-field sustainable and easy-to-install MRI, as demonstrated, thus, can be utilized for evaluating knee cartilage thickness and bone segmentations aided by established DL algorithms trained at higher-field strengths out-of-the-box initially. This could be utilized at the far-spread point-of-care locations with a lack of radiologists available to manually segment low-field images, at least till a decent base of low-field data pool is collated. With further fine-tuning with manual labeling of low-field data or utilizing synthesized higher SNR images from low-field images, OA biomarker quantification performance is potentially guaranteed to be further improved.

Submitted 26 October, 2023; originally announced October 2023.

Comments: 11 Pages, 3 Figures, 2 Tables

arXiv:2310.10138 [pdf, other]

Node-based Knowledge Graph Contrastive Learning for Medical Relationship Prediction

Authors: Zhiguang Fan, Yuedong Yang, Mingyuan Xu, Hongming Chen

Abstract: The embedding of Biomedical Knowledge Graphs (BKGs) generates robust representations, valuable for a variety of artificial intelligence applications, including predicting drug combinations and reasoning disease-drug relationships. Meanwhile, contrastive learning (CL) is widely employed to enhance the distinctiveness of these representations. However, constructing suitable contrastive pairs for CL, especially within Knowledge Graphs (KGs), has been challenging. In this paper, we proposed a novel node-based contrastive learning method for knowledge graph embedding, NC-KGE. NC-KGE enhances knowledge extraction in embeddings and speeds up training convergence by constructing appropriate contrastive node pairs on KGs. This scheme can be easily integrated with other knowledge graph embedding (KGE) methods. For downstream tasks such as biochemical relationship prediction, we have incorporated a relation-aware attention mechanism into NC-KGE, focusing on the semantic relationships and node interactions. Extensive experiments show that NC-KGE performs competitively with state-of-the-art models on public datasets like FB15k-237 and WN18RR. Particularly in biomedical relationship prediction tasks, NC-KGE outperforms all baselines on datasets such as PharmKG8k-28, DRKG17k-21, and BioKG72k-14, especially in predicting drug combination relationships. We release our code at https://github.com/zhi520/NC-KGE.

Submitted 16 October, 2023; originally announced October 2023.

Comments: 10 pages, 5 figures, conference

arXiv:2310.02546 [pdf, other]

Joint Design of Protein Sequence and Structure based on Motifs

Authors: Zhenqiao Song, Yunlong Zhao, Yufei Song, Wenxian Shi, Yang Yang, Lei Li

Abstract: Designing novel proteins with desired functions is crucial in biology and chemistry. However, most existing work focuses on protein sequence design, leaving protein sequence and structure co-design underexplored. In this paper, we propose GeoPro, a method to design protein backbone structure and sequence jointly. Our motivation is that protein sequence and its backbone structure constrain each other, and thus joint design of both can not only avoid nonfolding and misfolding but also produce more diverse candidates with desired functions. To this end, GeoPro is powered by an equivariant encoder for three-dimensional (3D) backbone structure and a protein sequence decoder guided by 3D geometry. Experimental results on two biologically significant metalloprotein datasets, including β-lactamases and myoglobins, show that our proposed GeoPro outperforms several strong baselines on most metrics. Remarkably, our method discovers novel β-lactamases and myoglobins which are not present in the Protein Data Bank (PDB) and UniProt. These proteins exhibit stable folding and active site environments reminiscent of those of natural proteins, demonstrating their excellent potential to be biologically functional.

Submitted 3 October, 2023; originally announced October 2023.

arXiv:2308.15116 [pdf, other]

Mixup-Augmented Meta-Learning for Sample-Efficient Fine-Tuning of Protein Simulators

Authors: Jingbang Chen, Yian Wang, Xingwei Qu, Shuangjia Zheng, Yaodong Yang, Hao Dong, Jie Fu

Abstract: Molecular dynamics simulations have emerged as a fundamental instrument for studying biomolecules. At the same time, it is desirable to perform simulations of a collection of particles under various conditions in which the molecules can fluctuate. In this paper, we explore and adapt the soft prompt-based learning method to molecular dynamics tasks. Our model can remarkably generalize to unseen and out-of-distribution scenarios with limited training data. While our work focuses on temperature as a test case, the versatility of our approach allows for efficient simulation through any continuous dynamic conditions, such as pressure and volumes. Our framework has two stages: 1) Pre-trains with data mixing technique, augments molecular structure data and temperature prompts, then applies a curriculum learning method by increasing the ratio of them smoothly. 2) Meta-learning-based fine-tuning framework improves sample-efficiency of fine-tuning process and gives the soft prompt-tuning better initialization points. Comprehensive experiments reveal that our framework excels in accuracy for in-domain data and demonstrates strong generalization capabilities for unseen and out-of-distribution samples.

Submitted 9 October, 2023; v1 submitted 29 August, 2023; originally announced August 2023.

arXiv:2308.00237 [pdf, other]

EC-Conf: An Ultra-fast Diffusion Model for Molecular Conformation Generation with Equivariant Consistency

Authors: Zhiguang Fan, Yuedong Yang, Mingyuan Xu, Hongming Chen

Abstract: Despite recent advancement in 3D molecule conformation generation driven by diffusion models, its high computational cost in iterative diffusion/denoising process limits its application. In this paper, an equivariant consistency model (EC-Conf) was proposed as a fast diffusion method for low-energy conformation generation. In EC-Conf, a modified SE(3)-equivariant transformer model was directly used to encode the Cartesian molecular conformations and a highly efficient consistency diffusion process was carried out to generate molecular conformations. It was demonstrated that, with only one sampling step, it can already achieve comparable quality to other diffusion-based models running with thousands of denoising steps. Its performance can be further improved with a few more sampling iterations. The performance of EC-Conf is evaluated on both GEOM-QM9 and GEOM-Drugs sets. Our results demonstrate that the efficiency of EC-Conf for learning the distribution of low energy molecular conformation is at least two orders of magnitude higher than current SOTA diffusion models and could potentially become a useful tool for conformation generation and sampling. We release our code at https://github.com/zhi520/EcConf.

Submitted 23 November, 2023; v1 submitted 31 July, 2023; originally announced August 2023.

Comments: 10 pages, 3 figures

arXiv:2306.10070 [pdf]

doi 10.1093/bib/bbad493

Opportunities and Challenges for ChatGPT and Large Language Models in Biomedicine and Health

Authors: Shubo Tian, Qiao Jin, Lana Yeganova, Po-Ting Lai, Qingqing Zhu, Xiuying Chen, Yifan Yang, Qingyu Chen, Won Kim, Donald C. Comeau, Rezarta Islamaj, Aadit Kapoor, Xin Gao, Zhiyong Lu

Abstract: ChatGPT has drawn considerable attention from both the general public and domain experts with its remarkable text generation capabilities. This has subsequently led to the emergence of diverse applications in the field of biomedicine and health. In this work, we examine the diverse applications of large language models (LLMs), such as ChatGPT, in biomedicine and health. Specifically, we explore the areas of biomedical information retrieval, question answering, medical text summarization, information extraction, and medical education, and investigate whether LLMs possess the transformative power to revolutionize these tasks or whether the distinct complexities of the biomedical domain present unique challenges. Following an extensive literature survey, we find that significant advances have been made in the field of text generation tasks, surpassing the previous state-of-the-art methods. For other applications, the advances have been modest. Overall, LLMs have not yet revolutionized biomedicine, but recent rapid progress indicates that such methods hold great potential to provide valuable means for accelerating discovery and improving health. We also find that the use of LLMs, like ChatGPT, in the fields of biomedicine and health entails various risks and challenges, including fabricated information in its generated responses, as well as legal and privacy concerns associated with sensitive patient data.
We believe this survey can provide a comprehensive and timely overview to biomedical researchers and healthcare practitioners on the opportunities and challenges associated with using ChatGPT and other LLMs for transforming biomedicine and health.

Submitted 16 October, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

arXiv:2305.18090 [pdf, other]

ChatGPT-powered Conversational Drug Editing Using Retrieval and Domain Feedback

Authors: Shengchao Liu, Jiongxiao Wang, Yijin Yang, Chengpeng Wang, Ling Liu, Hongyu Guo, Chaowei Xiao

Abstract: Recent advancements in conversational large language models (LLMs), such as ChatGPT, have demonstrated remarkable promise in various domains, including drug discovery. However, existing works mainly focus on investigating the capabilities of conversational LLMs on chemical reaction and retrosynthesis, while drug editing, a critical task in the drug discovery pipeline, remains largely unexplored. To bridge this gap, we propose ChatDrug, a framework to facilitate the systematic investigation of drug editing using LLMs. ChatDrug jointly leverages a prompt module, a retrieval and domain feedback (ReDF) module, and a conversation module to streamline effective drug editing. We empirically show that ChatDrug reaches the best performance on 33 out of 39 drug editing tasks, encompassing small molecules, peptides, and proteins. We further demonstrate, through 10 case studies, that ChatDrug can successfully identify the key substructures (e.g., the molecule functional groups, peptide motifs, and protein structures) for manipulation, generating diverse and valid suggestions for drug editing. Promisingly, we also show that ChatDrug can offer insightful explanations from a domain-specific perspective, enhancing interpretability and enabling informed decision-making. This research sheds light on the potential of ChatGPT and conversational LLMs for drug editing. It paves the way for a more efficient and collaborative drug discovery pipeline, contributing to the advancement of pharmaceutical research and development.

Submitted 29 May, 2023; originally announced May 2023.

arXiv:2305.14376 [pdf, other]

PTGB: Pre-Train Graph Neural Networks for Brain Network Analysis

Authors: Yi Yang, Hejie Cui, Carl Yang

Abstract: The human brain is the central hub of the neurobiological system, controlling behavior and cognition in complex ways. Recent advances in neuroscience and neuroimaging analysis have shown a growing interest in the interactions between brain regions of interest (ROIs) and their impact on neural development and disorder diagnosis. As a powerful deep model for analyzing graph-structured data, Graph Neural Networks (GNNs) have been applied for brain network analysis. However, training deep models requires large amounts of labeled data, which is often scarce in brain network datasets due to the complexities of data acquisition and sharing restrictions. To make the most out of available training data, we propose PTGB, a GNN pre-training framework that captures intrinsic brain network structures, regardless of clinical outcomes, and is easily adaptable to various downstream tasks. PTGB comprises two key components: (1) an unsupervised pre-training technique designed specifically for brain networks, which enables learning from large-scale datasets without task-specific labels; (2) a data-driven parcellation atlas mapping pipeline that facilitates knowledge transfer across datasets with different ROI systems. Extensive evaluations using various GNN models have demonstrated the robust and superior performance of PTGB compared to baseline methods.

Submitted 20 May, 2023; originally announced May 2023.

Comments: Accepted to CHIL 2023, 19 pages

arXiv:2305.09156 [pdf, other]

Modelling Human Visual Motion Processing with Trainable Motion Energy Sensing and a Self-attention Network

Authors: Zitang Sun, Yen-Ju Chen, Yung-hao Yang, Shin'ya Nishida

Abstract: Visual motion processing is essential for humans to perceive and interact with dynamic environments. Despite extensive research in cognitive neuroscience, image-computable models that can extract informative motion flow from natural scenes in a manner consistent with human visual processing have yet to be established. Meanwhile, recent advancements in computer vision (CV), propelled by deep learning, have led to significant progress in optical flow estimation, a task closely related to motion perception. Here we propose an image-computable model of human motion perception by bridging the gap between biological and CV models. Specifically, we introduce a novel two-stage approach that combines trainable motion energy sensing with a recurrent self-attention network for adaptive motion integration and segregation. This model architecture aims to capture the computations in V1-MT, the core structure for motion perception in the biological visual system, while providing the ability to derive informative motion flow for a wide range of stimuli, including complex natural scenes. In silico neurophysiology reveals that our model's unit responses are similar to mammalian neural recordings regarding motion pooling and speed tuning. The proposed model can also replicate human responses to a range of stimuli examined in past psychophysical studies. The experimental results on the Sintel benchmark demonstrate that our model predicts human responses better than the ground truth, whereas the state-of-the-art CV models show the opposite.
Our study provides a computational architecture consistent with human visual motion processing, although the physiological correspondence may not be exact.

Submitted 9 November, 2023; v1 submitted 16 May, 2023; originally announced May 2023.

Comments: accepted by NeurIPS 2023

arXiv:2304.09667 [pdf, other]

doi 10.1093/bioinformatics/btae075

GeneGPT: Augmenting Large Language Models with Domain Tools for Improved Access to Biomedical Information

Authors: Qiao Jin, Yifan Yang, Qingyu Chen, Zhiyong Lu

Abstract: While large language models (LLMs) have been successfully applied to various tasks, they still face challenges with hallucinations. Augmenting LLMs with domain-specific tools such as database utilities can facilitate easier and more precise access to specialized knowledge. In this paper, we present GeneGPT, a novel method for teaching LLMs to use the Web APIs of the National Center for Biotechnology Information (NCBI) for answering genomics questions. Specifically, we prompt Codex to solve the GeneTuring tests with NCBI Web APIs by in-context learning and an augmented decoding algorithm that can detect and execute API calls. Experimental results show that GeneGPT achieves state-of-the-art performance on eight tasks in the GeneTuring benchmark with an average score of 0.83, largely surpassing retrieval-augmented LLMs such as the new Bing (0.44), biomedical LLMs such as BioMedLM (0.08) and BioGPT (0.04), as well as GPT-3 (0.16) and ChatGPT (0.12). Our further analyses suggest that: (1) API demonstrations have good cross-task generalizability and are more useful than documentation for in-context learning; (2) GeneGPT can generalize to longer chains of API calls and answer multi-hop questions in GeneHop, a novel dataset introduced in this work; (3) different types of errors are enriched in different tasks, providing valuable insights for future improvements.

Submitted 16 May, 2023; v1 submitted 19 April, 2023; originally announced April 2023.

Journal ref: Bioinformatics, 2024

arXiv:2304.07295 [pdf]

Experts' cognition-driven safe noisy labels learning for precise segmentation of residual tumor in breast cancer

Authors: Yongquan Yang, Jie Chen, Yani Wei, Mohammad Alobaidi, Hong Bu

Abstract: Precise segmentation of residual tumor in breast cancer (PSRTBC) after neoadjuvant chemotherapy is a fundamental key technique in the treatment process of breast cancer. However, achieving PSRTBC is still a challenge, since the breast cancer tissue and tumor cells commonly have complex and varied morphological changes after neoadjuvant chemotherapy, which inevitably increases the difficulty of producing a predictive model that has good generalization with machine learning. To alleviate this situation, in this paper, we propose an experts' cognition-driven safe noisy labels learning (ECDSNLL) approach. In the concept of safe noisy labels learning, which is a typical type of safe weakly supervised learning, ECDSNLL is constructed by integrating the pathology experts' cognition about identifying residual tumor in breast cancer and the artificial intelligence experts' cognition about data modeling with provided data basis. We show the advantages of the proposed ECDSNLL approach and its promising potential in addressing PSRTBC. We also release a better predictive model for achieving PSRTBC, which can be leveraged to promote the development of related application software.

Submitted 12 April, 2023; originally announced April 2023.

arXiv:2302.05511 [pdf]

Physiologically-Based Pharmacokinetic Modeling of Blood Clearance of Liver Fluorescent Markers for the Assessment of the Degree of Hepatic Ischemia-Reperfusion Injury

Authors: Christopher Monti, Said H. Audi, Justin Womack, Seung-Keun Hong, Yongqiang Yang, Joohyun Kim, Ranjan K. Dash

Abstract: During liver transplantation, ischemia-reperfusion injury (IRI) is inevitable and decreases the overall success of the surgery. While guidelines exist, there is no reliable way to quantitatively assess the degree of IRI present in the liver. Our recent study has shown a correlation between the bile-to-plasma ratio of FDA-approved sodium fluorescein (SF) and the degree of hepatic IRI, presumably due to an IRI-induced decrease in the activity of the hepatic multidrug resistance-associated protein 2 (MRP2); however, the contribution of SF blood clearance via the bile is still convoluted with other factors, such as renal clearance. In this work, we sought to computationally model SF blood clearance via the bile. First, we converted extant SF fluorescence data from rat whole blood, plasma, and bile to concentrations using calibration curves. Next, based on these SF concentration data, we generated a liver-centric, physiologically-based pharmacokinetic (PBPK) model of SF liver uptake and clearance via the bile. Model simulations show that SF bile concentration is highly sensitive to a change in the activity of hepatic MRP2. These simulations suggest that SF bile clearance along with the PBPK model can be used to quantify the effect of IRI on the activity of MRP2.

Submitted 10 February, 2023; originally announced February 2023.

Comments: 6 pages, 6 figures, Submitted to IEEE-EMBC Conference Proceedings

arXiv:2301.07248 [pdf]

Deep Learning Enables Reduced Gadolinium Dose for Contrast-Enhanced Blood-Brain Barrier Opening

Authors: P. Lee, H. Wei, A. N. Pouliopoulos, B. T. Forsyth, Y. Yang, C. Zhang, A. F. Laine, E. E. Konofagou, C. Wu, J. Guo

Abstract: Focused ultrasound (FUS) can be used to open the blood-brain barrier (BBB), and MRI with contrast agents can detect that opening. However, repeated use of gadolinium-based contrast agents (GBCAs) presents safety concerns to patients. This study is the first to propose the idea of modeling a volume transfer constant (Ktrans) through deep learning to reduce the dosage of contrast agents. The goal of the study is not only to reconstruct artificial intelligence (AI) derived Ktrans images but also to enhance the intensity with low-dosage contrast agent T1-weighted MRI scans. We successfully validated this idea through a previous state-of-the-art temporal network algorithm, which focused on extracting time domain features at the voxel level. Then we used a Spatiotemporal Network (ST-Net), composed of a spatiotemporal convolutional neural network (CNN)-based deep learning architecture with the addition of a three-dimensional CNN encoder, to improve the model performance. We tested the ST-Net model on ten datasets of FUS-induced BBB-openings acquired from different sides of the mouse brain. ST-Net successfully detected and enhanced BBB-opening signals without sacrificing spatial domain information. ST-Net was shown to be a promising method of reducing the need for contrast agents for modeling BBB-opening Ktrans maps from time-series Dynamic Contrast-Enhanced Magnetic Resonance Imaging (DCE-MRI) scans.

Submitted 17 January, 2023; originally announced January 2023.

arXiv:2301.02660 [pdf]

Decreased serum vitamin D level as a prognostic marker in patients with COVID-19

Authors: Ruyi Qu, Qiuji Yang, Yingying Bi, Jiajing Cheng, Mengna He, Xin Wei, Yiqi Yuan, Yuxin Yang, Jinlong Qin

Abstract: Background: The coronavirus disease 2019 (COVID-19) pandemic, which is caused by severe acute respiratory syndrome coronavirus 2, still causes localized outbreaks and has resulted in a high rate of infection and severe disease in older patients with comorbidities. The vitamin D status of the population has been found to be an important factor that could influence the outcome of COVID-19. However, whether vitamin D can lessen the symptoms or severity of COVID-19 remains controversial. Methods: A total of 719 patients with confirmed COVID-19 were enrolled retrospectively in this study from April 13 to June 6, 2022 at Shanghai Fourth People's Hospital. The circulating levels of 25(OH)D3, inflammatory factors, and clinical parameters were assayed. Time to viral RNA clearance (TVRC), classification and prognosis of COVID-19 were used to evaluate the severity of COVID-19 infection. Results: The median age was 76 years (interquartile range, IQR, 64.5-84.6), 44.1% of patients were male, and the TVRC was 11 days (IQR, 7-16) in this population. The median level of 25(OH)D3 was 27.15 (IQR, 19.31-38.89) nmol/L. Patients with lower serum 25(OH)D3 had prolonged time to viral clearance, more obvious inflammatory response, more severe respiratory symptoms and higher risks of impaired hepatic and renal function. Multiple regression analyses revealed that serum 25(OH)D3 level was independently and negatively associated with TVRC. ROC curve analysis showed that the serum vitamin D level could significantly predict the severity classification and prognosis of COVID-19. Conclusions: Serum 25(OH)D3 level is independently associated with the severity of COVID-19 in elderly patients, and it could be used as a predictor of the severity of COVID-19. In addition, supplementation with vitamin D might provide beneficial effects in older patients with COVID-19.

Submitted 25 December, 2022; originally announced January 2023.

arXiv:2211.12421 [pdf, other]

Data-Driven Network Neuroscience: On Data Collection and Benchmark

Authors: Jiaxing Xu, Yunhan Yang, David Tse Jung Huang, Sophi Shilpa Gururajapathy, Yiping Ke, Miao Qiao, Alan Wang, Haribalan Kumar, Josh McGeown, Eryn Kwon

Abstract: This paper presents a comprehensive and quality collection of functional human brain network data for potential research in the intersection of neuroscience, machine learning, and graph analytics. Anatomical and functional MRI images have been used to understand the functional connectivity of the human brain and are particularly important in identifying underlying neurodegenerative conditions such as Alzheimer's, Parkinson's, and Autism. Recently, the study of the brain in the form of brain networks using machine learning and graph analytics has become increasingly popular, especially to predict the early onset of these conditions. A brain network, represented as a graph, retains rich structural and positional information that traditional examination methods are unable to capture. However, the lack of publicly accessible brain network data prevents researchers from data-driven explorations. One of the main difficulties lies in the complicated domain-specific preprocessing steps and the exhaustive computation required to convert the data from MRI images into brain networks. We bridge this gap by collecting a large number of MRI images from public databases and a private source, working with domain experts to make sensible design choices, and preprocessing the MRI images to produce a collection of brain network datasets. The datasets originate from 6 different sources, cover 4 brain conditions, and consist of a total of 2,702 subjects. We test our graph datasets on 12 machine learning models to provide baselines and validate the data quality on a recent graph analysis model. To lower the barrier to entry and promote research in this interdisciplinary field, we release our brain network data and complete preprocessing details including code at https://doi.org/10.17608/k6.auckland.21397377 and https://github.com/brainnetuoa/data_driven_network_neuroscience.

Submitted 29 October, 2023; v1 submitted 10 November, 2022; originally announced November 2022.

Journal ref: Advances in Neural Information Processing Systems, 2023

arXiv:2211.11750 [pdf, other]

Reconstructing high-order sequence features of dynamic functional connectivity networks based on diversified covert attention patterns for Alzheimer's disease classification

Authors: Zhixiang Zhang, Biao Jie, Zhengdong Wang, Jie Zhou, Yang Yang

Abstract: Recent studies have applied deep learning methods such as convolutional recurrent neural networks (CRNs) and Transformers to brain disease classification based on dynamic functional connectivity networks (dFCNs), such as Alzheimer's disease (AD), achieving better performance than traditional machine learning methods. However, in CRNs, the continuous convolution operations used to obtain high-order aggregation features may overlook the non-linear correlation between different brain regions due to the essence of convolution being the linear weighted sum of local elements. Inspired by modern neuroscience on the research of covert attention in the nervous system, we introduce the self-attention mechanism, a core module of Transformers, to model diversified covert attention patterns and apply these patterns to reconstruct high-order sequence features of dFCNs in order to learn complex dynamic changes in brain information flow. Therefore, we propose a novel CRN method based on diversified covert attention patterns, DCA-CRN, which combines the advantages of CRNs in capturing local spatio-temporal features and sequence change patterns, as well as Transformers in learning global and high-order correlation features. Experimental results on the ADNI and ADHD-200 datasets demonstrate the prediction performance and generalization ability of our proposed method.

Submitted 4 September, 2023; v1 submitted 18 November, 2022; originally announced November 2022.

arXiv:2207.09991 [pdf, other]

Causal Models, Prediction, and Extrapolation in Cell Line Perturbation Experiments

Authors: James P. Long, Yumeng Yang, Kim-Anh Do

Abstract: In cell line perturbation experiments, a collection of cells is perturbed with external agents (e.g. drugs) and responses such as protein expression are measured. Due to cost constraints, only a small fraction of all possible perturbations can be tested in vitro. This has led to the development of computational (in silico) models which can predict cellular responses to perturbations. Perturbations with clinically interesting predicted responses can be prioritized for in vitro testing. In this work, we compare causal and non-causal regression models for perturbation response prediction in a Melanoma cancer cell line. The current best performing method on this data set is Cellbox, which models how proteins causally affect each other using a system of ordinary differential equations (ODEs). We derive a closed form solution to the Cellbox system of ODEs in the linear case. These analytic results facilitate comparison of Cellbox to regression approaches. We show that causal models such as Cellbox, while requiring more assumptions, enable extrapolation in ways that non-causal regression models cannot. For example, causal models can predict responses for never before tested drugs. We illustrate these strengths and weaknesses in simulations. In an application to the Melanoma cell line data, we find that regression models outperform the Cellbox causal model.

Submitted 20 July, 2022; originally announced July 2022.

Comments: 13 pages, 4 figures
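
As a reminder of what a closed-form solution to a linear ODE system looks like, the textbook case is sketched below. This is a generic result, not the paper's derivation: the symbols $A$ (interaction matrix), $u$ (constant perturbation vector), and $x_0$ (initial state) are assumptions introduced here, and the exact form of the Cellbox equations is not given in this listing.

```latex
% Generic linear system with constant input (assumed notation):
\frac{dx}{dt} = A x + u, \qquad x(0) = x_0 .

% Closed-form solution, valid when A is invertible:
x(t) = e^{At} x_0 + A^{-1}\left(e^{At} - I\right) u .

% If every eigenvalue of A has negative real part, e^{At} \to 0 and the
% steady state is linear in the perturbation:
x^{*} = \lim_{t \to \infty} x(t) = -A^{-1} u .
```

In this generic setting the steady-state response is a linear function of the perturbation, which is the kind of structure that makes a direct comparison with regression models possible.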

arXiv:2207.03569 [pdf]

Enhanced brain structure-function tethering in transmodal cortex revealed by high-frequency eigenmodes

Authors: Yaqian Yang, Zhiming Zheng, Longzhao Liu, Hongwei Zheng, Yi Zhen, Yi Zheng, Xin Wang, Shaoting Tang

Abstract: The brain's structural connectome supports signal propagation between neuronal elements, shaping diverse coactivation patterns that can be captured as functional connectivity. While the link between structure and function remains an ongoing challenge, the prevailing hypothesis is that the structure-function relationship may itself be gradually decoupled along a macroscale functional gradient spanning unimodal to transmodal regions. However, this hypothesis is strongly constrained by the underlying models which may neglect requisite signaling mechanisms. Here, we transform the structural connectome into a set of orthogonal eigenmodes governing frequency-specific diffusion patterns and show that regional structure-function relationships vary markedly under different signaling mechanisms. Specifically, low-frequency eigenmodes, which are considered sufficient to capture the essence of the functional network, contribute little to functional connectivity reconstruction in transmodal regions, resulting in structure-function decoupling along the unimodal-transmodal gradient. In contrast, high-frequency eigenmodes, which are usually on the periphery of attention due to their association with noisy and random dynamical patterns, contribute significantly to functional connectivity prediction in transmodal regions, inducing gradually convergent structure-function relationships from unimodal to transmodal regions. Although the information in high-frequency eigenmodes is weak and scattered, it effectively enhances the structure-function correspondence by 35% in unimodal regions and 56% in transmodal regions. Altogether, our findings suggest that the structure-function divergence in transmodal areas may not be an intrinsic property of brain organization, but can be narrowed through multiplexed and regionally specialized signaling mechanisms.

Submitted 7 July, 2022; originally announced July 2022.

arXiv:2206.12980 [pdf]

Detecting Schizophrenia with 3D Structural Brain MRI Using Deep Learning

Authors: Junhao Zhang, Vishwanatha M. Rao, Ye Tian, Yanting Yang, Nicolas Acosta, Zihan Wan, Pin-Yu Lee, Chloe Zhang, Lawrence S. Kegeles, Scott A. Small, Jia Guo

Abstract: Schizophrenia is a chronic neuropsychiatric disorder that causes distinct structural alterations within the brain. We hypothesize that deep learning applied to a structural neuroimaging dataset could detect disease-related alteration and improve classification and diagnostic accuracy. We tested this hypothesis using a single, widely available, and conventional T1-weighted MRI scan, from which we extracted the 3D whole-brain structure using standard post-processing methods. A deep learning model was then developed, optimized, and evaluated on three open datasets with T1-weighted MRI scans of patients with schizophrenia. Our proposed model outperformed the benchmark model, which was also trained with structural MR images using a 3D CNN architecture. Our model is capable of almost perfectly (area under the ROC curve = 0.987) distinguishing schizophrenia patients from healthy controls on unseen structural MRI scans. Regional analysis localized subcortical regions and ventricles as the most predictive brain regions. Subcortical structures serve a pivotal role in cognitive, affective, and social functions in humans, and structural abnormalities of these regions have been associated with schizophrenia. Our finding corroborates that schizophrenia is associated with widespread alterations in subcortical brain structure and the subcortical structural information provides prominent features in diagnostic classification. Together, these results further demonstrate the potential of deep learning to improve schizophrenia diagnosis and identify its structural neuroimaging signatures from a single, standard T1-weighted brain MRI.

Submitted 7 July, 2022; v1 submitted 26 June, 2022; originally announced June 2022.

Comments: 13 pages, 6 figures

arXiv:2206.04486 [pdf, other]

doi 10.1145/3534678.3542680

Data-Efficient Brain Connectome Analysis via Multi-Task Meta-Learning

Authors: Yi Yang, Yanqiao Zhu, Hejie Cui, Xuan Kan, Lifang He, Ying Guo, Carl Yang

Abstract: Brain networks characterize complex connectivities among brain regions as graph structures, which provide a powerful means to study brain connectomes. In recent years, graph neural networks have emerged as a prevalent paradigm of learning with structured data. However, most brain network datasets are limited in sample sizes due to the relatively high cost of data acquisition, which hinders the deep learning models from sufficient training. Inspired by meta-learning that learns new concepts fast with limited training examples, this paper studies data-efficient training strategies for analyzing brain connectomes in a cross-dataset setting. Specifically, we propose to meta-train the model on datasets of large sample sizes and transfer the knowledge to small datasets. In addition, we also explore two brain-network-oriented designs, including atlas transformation and adaptive task reweighting. Compared to other pre-training strategies, our meta-learning-based approach achieves higher and more stable performance, which demonstrates the effectiveness of our proposed solutions. The framework is also able to derive new insights regarding the similarities among datasets and diseases in a data-driven fashion.

Submitted 9 June, 2022; originally announced June 2022.

Comments: Accepted to KDD 2022 (Health Day), 9 pages

arXiv:2203.13132 [pdf, other]

DPST: De Novo Peptide Sequencing with Amino-Acid-Aware Transformers

Authors: Yan Yang, Zakir Hossain, Khandaker Asif, Liyuan Pan, Shafin Rahman, Eric Stone

Abstract: De novo peptide sequencing aims to recover amino acid sequences of a peptide from tandem mass spectrometry (MS) data. Existing approaches for de novo analysis enumerate MS evidence for all amino acid classes during inference. This leads to over-trimming on receptive fields of MS data and restricts MS evidence associated with following undecoded amino acids. Our approach, DPST, circumvents these limitations with two key components: (1) a confidence value aggregation encoder to sketch spectrum representations according to amino-acid-based connectivity among MS; (2) a global-local fusion decoder to progressively assimilate contextualized spectrum representations with a predefined preconception of localized MS evidence and amino acid priors. Our components originate from a closed-form solution and selectively attend to informative amino-acid-aware MS representations. Through extensive empirical studies, we demonstrate the superiority of DPST, showing that it outperforms state-of-the-art approaches by a margin of 12%-19% in peptide accuracy.

Submitted 23 March, 2022; originally announced March 2022.

arXiv:2201.04437 [pdf]

Multi-task Joint Strategies of Self-supervised Representation Learning on Biomedical Networks for Drug Discovery

Authors: Xiaoqi Wang, Yingjie Cheng, Yaning Yang, Yue Yu, Fei Li, Shaoliang Peng

Abstract: Self-supervised representation learning (SSL) on biomedical networks provides new opportunities for drug discovery. However, how to effectively combine multiple SSL models is still challenging and has been rarely explored. Therefore, we propose multi-task joint strategies of self-supervised representation learning on biomedical networks for drug discovery, named MSSL2drug. We design six basic SSL tasks inspired by various modality features including structures, semantics, and attributes in heterogeneous biomedical networks. Importantly, fifteen combinations of multiple tasks are evaluated by a graph attention-based multi-task adversarial learning framework in two drug discovery scenarios. The results suggest two important findings. (1) Combinations of multimodal tasks achieve the best performance compared to other multi-task joint models. (2) The local-global combination models yield higher performance than random two-task combinations when the number of modalities is the same. Therefore, we conjecture that the multimodal and local-global combination strategies can be treated as the guideline of multi-task SSL for drug discovery.

Submitted 18 December, 2022; v1 submitted 12 January, 2022; originally announced January 2022.

Comments: 44 pages, 11 figures

arXiv:2109.15089 [pdf, other]

doi 10.3389/fncom.2022.789253

Biologically Plausible Training Mechanisms for Self-Supervised Learning in Deep Networks

Authors: Mufeng Tang, Yibo Yang, Yali Amit

Abstract: We develop biologically plausible training mechanisms for self-supervised learning (SSL) in deep networks. Specifically, by biologically plausible training we mean (i) all updates of weights are based on current activities of pre-synaptic units and current, or activity retrieved from short term memory of post-synaptic units, including at the top-most error computing layer, (ii) complex computations such as normalization, inner products and division are avoided, (iii) asymmetric connections between units, (iv) most learning is carried out in an unsupervised manner. SSL with a contrastive loss satisfies condition (iv) as it does not require labelled data and it introduces robustness to observed perturbations of objects, which occur naturally as objects or observer move in 3D and with variable lighting over time. We propose a contrastive hinge based loss whose error involves simple local computations satisfying (ii), as opposed to the standard contrastive losses employed in the literature, which do not lend themselves easily to implementation in a network architecture due to complex computations involving ratios and inner products. Furthermore, we show that learning can be performed with one of two more plausible alternatives to backpropagation that satisfy conditions (i) and (ii). The first is difference target propagation (DTP) and the second is layer-wise learning (LL), where each layer is directly connected to a layer computing the loss error. Both methods represent alternatives to the symmetric weight issue of backpropagation. By training convolutional neural networks (CNNs) with SSL and DTP or LL, we find that our proposed framework achieves comparable performance to standard BP learning in downstream linear classifier evaluation of the learned embeddings.

Submitted 1 February, 2022; v1 submitted 30 September, 2021; originally announced September 2021.

Comments: To be published in Frontiers in Computational Neuroscience

arXiv:2107.04119 [pdf, ps, other]

Quantitative Evaluation of Explainable Graph Neural Networks for Molecular Property Prediction

Authors: Jiahua Rao, Shuangjia Zheng, Yuedong Yang

Abstract: Advances in machine learning have led to graph neural network-based methods for drug discovery, yielding promising results in molecular design, chemical synthesis planning, and molecular property prediction. However, current graph neural networks (GNNs) have seen limited acceptance in drug discovery due to their lack of interpretability. Although this major weakness has been mitigated by the development of explainable artificial intelligence (XAI) techniques, the "ground truth" assignment in most explainable tasks ultimately rests with subjective judgments by humans, so that the quality of model interpretation is hard to evaluate quantitatively. In this work, we first build three levels of benchmark datasets to quantitatively assess the interpretability of the state-of-the-art GNN models. Then we implement recent XAI methods in combination with different GNN algorithms to highlight the benefits, limitations, and future opportunities for drug discovery. As a result, GradInput and IG generally provide the best model interpretability for GNNs, especially when combined with GraphNet and CMPNN. The integrated and developed XAI package is fully open-sourced and can be used by practitioners to train new models on other drug discovery tasks.

Submitted 12 July, 2021; v1 submitted 1 July, 2021; originally announced July 2021.

arXiv:2105.13121 [pdf]

BioNavi-NP: Biosynthesis Navigator for Natural Products

Authors: Shuangjia Zheng, Tao Zeng, Chengtao Li, Binghong Chen, Connor W. Coley, Yuedong Yang, Ruibo Wu

Abstract: Nature, a synthetic master, creates more than 300,000 natural products (NPs) which are the major constituents of FDA-approved drugs owing to the vast chemical space of NPs. To date, there are fewer than 30,000 validated NP compounds involved in about 33,000 known enzyme catalytic reactions, and even fewer biosynthetic pathways are known with complete cascade-connected enzyme catalysis. Therefore, it is valuable to make computer-aided bio-retrosynthesis predictions. Here, we develop BioNavi-NP, a navigable and user-friendly toolkit, which is capable of predicting the biosynthetic pathways for NPs and NP-like compounds through a novel (AND-OR Tree)-based planning algorithm, an enhanced molecular Transformer neural network, and a training set that combines general organic transformations and biosynthetic steps. Extensive evaluations reveal that BioNavi-NP generalizes well to identifying the reported biosynthetic pathways for 90% of test compounds and recovering the verified building blocks for 73%, significantly outperforming conventional rule-based approaches. Moreover, BioNavi-NP also shows an outstanding capacity for enumerating biologically plausible pathways. In this sense, BioNavi-NP is a leading-edge toolkit to redesign complex biosynthetic pathways of natural products with applications to total or semi-synthesis and pathway elucidation or reconstruction.

Submitted 26 May, 2021; originally announced May 2021.

Comments: 14 pages

arXiv:2104.11364 [pdf]

A field guide to cultivating computational biology

Authors: Anne E Carpenter, Casey S Greene, Piero Carnici, Benilton S Carvalho, Michiel de Hoon, Stacey Finley, Kim-Anh Le Cao, Jerry SH Lee, Luigi Marchionni, Suzanne Sindi, Fabian J Theis, Gregory P Way, Jean YH Yang, Elana J Fertig

Abstract: Biomedical research centers can empower basic discovery and novel therapeutic strategies by leveraging their large-scale datasets from experiments and patients. This data, together with new technologies to create and analyze it, has ushered in an era of data-driven discovery which requires moving beyond the traditional individual, single-discipline investigator research model. This interdisciplinary niche is where computational biology thrives. It has matured over the past three decades and made major contributions to scientific knowledge and human health, yet researchers in the field often languish in career advancement, publication, and grant review. We propose solutions for individual scientists, institutions, journal publishers, funding agencies, and educators.

Submitted 22 April, 2021; originally announced April 2021.

arXiv:2104.04547 [pdf, other]

High-Throughput Virtual Screening of Small Molecule Inhibitors for SARS-CoV-2 Protein Targets with Deep Fusion Models

Authors: Garrett A. Stevenson, Derek Jones, Hyojin Kim, W. F. Drew Bennett, Brian J. Bennion, Monica Borucki, Feliza Bourguet, Aidan Epstein, Magdalena Franco, Brooke Harmon, Stewart He, Max P. Katz, Daniel Kirshner, Victoria Lao, Edmond Y. Lau, Jacky Lo, Kevin McLoughlin, Richard Mosesso, Deepa K. Murugesh, Oscar A. Negrete, Edwin A. Saada, Brent Segelke, Maxwell Stefan, Marisa W. Torres, Dina Weilhammer , et al. (7 additional authors not shown)

Abstract: Structure-based Deep Fusion models were recently shown to outperform several physics- and machine learning-based protein-ligand binding affinity prediction methods. As part of a multi-institutional COVID-19 pandemic response, over 500 million small molecules were computationally screened against four protein structures from the novel coronavirus (SARS-CoV-2), which causes COVID-19. Three enhancements to Deep Fusion were made in order to evaluate more than 5 billion docked poses on SARS-CoV-2 protein targets. First, the Deep Fusion concept was refined by formulating the architecture as one, coherently backpropagated model (Coherent Fusion) to improve binding-affinity prediction accuracy. Secondly, the model was trained using a distributed, genetic hyper-parameter optimization. Finally, a scalable, high-throughput screening capability was developed to maximize the number of ligands evaluated and expedite the path to experimental evaluation. In this work, we present both the methods developed for machine learning-based high-throughput screening and results from using our computational pipeline to find SARS-CoV-2 inhibitors.

Submitted 31 May, 2021; v1 submitted 9 April, 2021; originally announced April 2021.

arXiv:2103.10432 [pdf, other]

MARS: Markov Molecular Sampling for Multi-objective Drug Discovery

Authors: Yutong Xie, Chence Shi, Hao Zhou, Yuwei Yang, Weinan Zhang, Yong Yu, Lei Li

Abstract: Searching for novel molecules with desired chemical properties is crucial in drug discovery. Existing work focuses on developing neural models to generate either molecular sequences or chemical graphs. However, it remains a big challenge to find novel and diverse compounds satisfying several properties. In this paper, we propose MARS, a method for multi-objective drug molecule discovery. MARS is based on the idea of generating the chemical candidates by iteratively editing fragments of molecular graphs. To search for high-quality candidates, it employs Markov chain Monte Carlo sampling (MCMC) on molecules with an annealing scheme and an adaptive proposal. To further improve sample efficiency, MARS uses a graph neural network (GNN) to represent and select candidate edits, where the GNN is trained on-the-fly with samples from MCMC. Experiments show that MARS achieves state-of-the-art performance in various multi-objective settings where molecular bio-activity, drug-likeness, and synthesizability are considered. Remarkably, in the most challenging setting where all four objectives are simultaneously optimized, our approach outperforms previous methods significantly in comprehensive evaluations. The code is available at https://github.com/yutxie/mars.

Submitted 18 March, 2021; originally announced March 2021.

Comments: ICLR 2021

arXiv:2103.06578 [pdf]

doi 10.1039/D1NR01672E

Quantitative Interpretations of Energetic Features and Key Residues at SARS Coronavirus Spike Receptor-Binding Domain and ACE2 Receptor Interface

Authors: Yanmei Yang, Yunju Zhang, Yuanyuan Qu, Xuewei Liu, Mingwen Zhao, Yuguang Mu, Weifeng Li

Abstract: The wide spread of coronavirus disease 2019 (COVID-19) has been declared a global health emergency. As one of the most important targets for antibody and drug development, the Spike RBD-ACE2 interface has received extensive attention. Here, using molecular dynamics simulations, we explicitly evaluated the binding energetic features of the RBD-ACE2 complex of both SARS-CoV and SARS-CoV-2 to find the key residues. Although the overall ACE2-binding mode of the SARS-CoV-2 RBD is nearly identical to that of the SARS-CoV RBD, the difference in binding affinity is as large as -16.35 kcal/mol. Energy decomposition analyses identified three binding patches in the SARS-CoV-2 RBD and eleven key residues (Phe486, Tyr505, Asn501, Tyr489, Gln493, Leu455, etc.) which are believed to be the main targets for drug development. The dominating forces are from van der Waals attractions and dehydration of these residues. It is also worth mentioning that we found seven mutational sites (Lys417, Leu455, Ala475, Gly476, Glu484, Gln498 and Val503) on SARS-CoV-2 which unexpectedly weakened the RBD-ACE2 binding. Very interestingly, the most repulsive residue at the RBD-ACE2 interface (E484) is found to be mutated in the latest UK variant, B.1.1.7, causing complete escape of the virus from neutralization by highly neutralizing COVID-19 convalescent plasma. Our present results indicate that, at least from the energetic point of view, such an E484 mutation may have beneficial effects on ACE2 binding.
The present study provides a systematic understanding, from the energetic point of view, of the binding features of the SARS-CoV-2 RBD with the ACE2 receptor. We hope that the present findings of three binding patches, key attracting residues and unexpected mutational sites can provide insights into the design of SARS-CoV-2 drugs and the identification of cross-active antibodies.

Submitted 11 March, 2021; originally announced March 2021.

Comments: 12 pages, 4 figures, 1 table

MSC Class: 92C05; 92C45; 92C50

Journal ref: Nanoscale, 2021

arXiv:2102.10471 [pdf, other]

Multi-Phase Locking Value: A Generalized Method for Determining Instantaneous Multi-frequency Phase Coupling

Authors: Bhavya Vasudeva, Runfeng Tian, Dee H. Wu, Shirley A. James, Hazem H. Refai, Fei He, Yuan Yang

Abstract: Many physical, biological and neural systems behave as coupled oscillators, with characteristic phase coupling across different frequencies. Methods such as $n:m$ phase locking value and bi-phase locking value have previously been proposed to quantify phase coupling between two resonant frequencies (e.g. $f$, $2f/3$) and across three frequencies (e.g. $f_1$, $f_2$, $f_1+f_2$), respectively. However, the existing phase coupling metrics have their limitations and limited applications. They cannot be used to detect or quantify phase coupling across multiple frequencies (e.g. $f_1$, $f_2$, $f_3$, $f_4$, $f_1+f_2+f_3-f_4$), or coupling that involves non-integer multiples of the frequencies (e.g. $f_1$, $f_2$, $2f_1/3+f_2/3$). To address the gap, this paper proposes a generalized approach, named multi-phase locking value (M-PLV), for the quantification of various types of instantaneous multi-frequency phase coupling. Different from most instantaneous phase coupling metrics that measure the simultaneous phase coupling, the proposed M-PLV method also allows the detection of delayed phase coupling and the associated time lag between coupled oscillators. The M-PLV has been tested on cases where synthetic coupled signals are generated using white Gaussian signals, and a system comprised of multiple coupled Rössler oscillators. Results indicate that the M-PLV can provide a reliable estimation of the time window and frequency combination where the phase coupling is significant, as well as a precise determination of time lag in the case of delayed coupling.
This method has the potential to become a powerful new tool for exploring phase coupling in complex nonlinear dynamic systems.

Submitted 2 January, 2022; v1 submitted 20 February, 2021; originally announced February 2021.

Comments: 9 pages, 8 figures

Showing 1–50 of 91 results for author: Yang, Y