Wireless Spectrum in Rural Farmlands: Status, Challenges and Opportunities

Authors: Mukaram Shahid, Kunal Das, Taimoor Ul Islam, Christ Somiah, Daji Qiao, Arsalan Ahmad, Jimming Song, Zhengyuan Zhu, Sarath Babu, Yong Guan, Tusher Chakraborty, Suraj Jog, Ranveer Chandra, Hongwei Zhang

Abstract: Due to factors such as low population density and expansive geographical distances, network deployment falls behind in rural regions, leading to a broadband divide. Wireless spectrum serves as the blood and flesh of wireless communications. Shared white spaces such as those in the TVWS and CBRS spectrum bands offer opportunities to expand connectivity, innovate, and provide affordable access to high-speed Internet in under-served areas without additional cost to expensive licensed spectrum. However, the current methods to utilize these white spaces are inefficient due to very conservative models and spectrum policies, causing under-utilization of valuable spectrum resources. This hampers the full potential of innovative wireless technologies that could benefit farmers, small Internet Service Providers (ISPs) or Mobile Network Operators (MNOs) operating in rural regions. This study explores the challenges faced by farmers and service providers when using shared spectrum bands to deploy their networks while ensuring maximum system performance and minimizing interference with other users. Additionally, we discuss how spatiotemporal spectrum models, in conjunction with database-driven spectrum-sharing solutions, can enhance the allocation and management of spectrum resources, ultimately improving the efficiency and reliability of wireless networks operating in shared spectrum bands.

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2407.03661 [pdf, other]

Configurable DOA Estimation using Incremental Learning

Authors: Yang Xiao, Rohan Kumar Das

Abstract: This study introduces a progressive neural network (PNN) model for direction of arrival (DOA) estimation, DOA-PNN, addressing the challenge due to catastrophic forgetting in adapting dynamic acoustic environments. While traditional methods such as GCC, MUSIC, and SRP-PHAT are effective in static settings, they perform worse in noisy, reverberant conditions. Deep learning models, particularly CNNs, offer improvements but struggle with a mismatch configuration between the training and inference phases. The proposed DOA-PNN overcomes these limitations by incorporating task incremental learning of continual learning, allowing for adaptation across varying acoustic scenarios with less forgetting of previously learned knowledge. Featuring task-specific sub-networks and a scaling mechanism, DOA-PNN efficiently manages parameter growth, ensuring high performance across incremental microphone configurations. We study DOA-PNN on a simulated data under various mic distance based microphone settings. The studies reveal its capability to maintain performance with minimal parameter increase, presenting an efficient solution for DOA estimation.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Submitted to DCASE WS 2024

arXiv:2407.03657 [pdf, other]

UCIL: An Unsupervised Class Incremental Learning Approach for Sound Event Detection

Authors: Yang Xiao, Rohan Kumar Das

Abstract: This work explores class-incremental learning (CIL) for sound event detection (SED), advancing adaptability towards real-world scenarios. CIL's success in domains like computer vision inspired our SED-tailored method, addressing the unique challenges of diverse and complex audio environments. Our approach employs an independent unsupervised learning framework with a distillation loss function to integrate new sound classes while preserving the SED model consistency across incremental tasks. We further enhance this framework with a sample selection strategy for unlabeled data and a balanced exemplar update mechanism, ensuring varied and illustrative sound representations. Evaluating various continual learning methods on the DCASE 2023 Task 4 dataset, we find that our research offers insights into each method's applicability for real-world SED systems that can have newly added sound classes. The findings also delineate future directions of CIL in dynamic audio settings.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Submitted to DCASE WS 2024

arXiv:2407.03656 [pdf, other]

WildDESED: An LLM-Powered Dataset for Wild Domestic Environment Sound Event Detection System

Authors: Yang Xiao, Rohan Kumar Das

Abstract: This work aims to advance sound event detection (SED) research by presenting a new large language model (LLM)-powered dataset namely wild domestic environment sound event detection (WildDESED). It is crafted as an extension to the original DESED dataset to reflect diverse acoustic variability and complex noises in home settings. We leveraged LLMs to generate eight different domestic scenarios based on target sound categories of the DESED dataset. Then we enriched the scenarios with a carefully tailored mixture of noises selected from AudioSet and ensured no overlap with target sound. We consider widely popular convolutional neural recurrent network to study WildDESED dataset, which depicts its challenging nature. We then apply curriculum learning by gradually increasing noise complexity to enhance the model's generalization capabilities across various noise levels. Our results with this approach show improvements within the noisy environment, validating the effectiveness on the WildDESED dataset promoting noise-robust SED advancements.

Submitted 4 July, 2024; originally announced July 2024.

Comments: Submitted to DCASE WS 2024

arXiv:2407.00291 [pdf, other]

FMSG-JLESS Submission for DCASE 2024 Task4 on Sound Event Detection with Heterogeneous Training Dataset and Potentially Missing Labels

Authors: Yang Xiao, Han Yin, Jisheng Bai, Rohan Kumar Das

Abstract: This report presents the systems developed and submitted by Fortemedia Singapore (FMSG) and Joint Laboratory of Environmental Sound Sensing (JLESS) for DCASE 2024 Task 4. The task focuses on recognizing event classes and their time boundaries, given that multiple events can be present and may overlap in an audio recording. The novelty this year is a dataset with two sources, making it challenging to achieve good performance without knowing the source of the audio clips during evaluation. To address this, we propose a sound event detection method using domain generalization. Our approach integrates features from bidirectional encoder representations from audio transformers and a convolutional recurrent neural network. We focus on three main strategies to improve our method. First, we apply mixstyle to the frequency dimension to adapt the mel-spectrograms from different domains. Second, we consider training loss of our model specific to each datasets for their corresponding classes. This independent learning framework helps the model extract domain-specific features effectively. Lastly, we use the sound event bounding boxes method for post-processing. Our proposed method shows superior macro-average pAUC and polyphonic SED score performance on the DCASE 2024 Challenge Task 4 validation dataset and public evaluation dataset.

Submitted 28 June, 2024; originally announced July 2024.

Comments: Technical report for DCASE 2024 Challenge Task 4

arXiv:2406.17574 [pdf, other]

Beyond Text-to-SQL for IoT Defense: A Comprehensive Framework for Querying and Classifying IoT Threats

Authors: Ryan Pavlich, Nima Ebadi, Richard Tarbell, Billy Linares, Adrian Tan, Rachael Humphreys, Jayanta Kumar Das, Rambod Ghandiparsi, Hannah Haley, Jerris George, Rocky Slavin, Kim-Kwang Raymond Choo, Glenn Dietrich, Anthony Rios

Abstract: Recognizing the promise of natural language interfaces to databases, prior studies have emphasized the development of text-to-SQL systems. While substantial progress has been made in this field, existing research has concentrated on generating SQL statements from text queries. The broader challenge, however, lies in inferring new information about the returned data. Our research makes two major contributions to address this gap. First, we introduce a novel Internet-of-Things (IoT) text-to-SQL dataset comprising 10,985 text-SQL pairs and 239,398 rows of network traffic activity. The dataset contains additional query types limited in prior text-to-SQL datasets, notably temporal-related queries. Our dataset is sourced from a smart building's IoT ecosystem exploring sensor read and network traffic data. Second, our dataset allows two-stage processing, where the returned data (network traffic) from a generated SQL can be categorized as malicious or not. Our results show that joint training to query and infer information about the data can improve overall text-to-SQL performance, nearly matching substantially larger models. We also show that current large language models (e.g., GPT3.5) struggle to infer new information about returned data, thus our dataset provides a novel test bed for integrating complex domain-specific reasoning into LLMs.

Submitted 25 June, 2024; originally announced June 2024.

arXiv:2406.02483 [pdf, other]

How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio?

Authors: Tianchi Liu, Lin Zhang, Rohan Kumar Das, Yi Ma, Ruijie Tao, Haizhou Li

Abstract: Partially manipulating a sentence can greatly change its meaning. Recent work shows that countermeasures (CMs) trained on partially spoofed audio can effectively detect such spoofing. However, the current understanding of the decision-making process of CMs is limited. We utilize Grad-CAM and introduce a quantitative analysis metric to interpret CMs' decisions. We find that CMs prioritize the artifacts of transition regions created when concatenating bona fide and spoofed audio. This focus differs from that of CMs trained on fully spoofed audio, which concentrate on the pattern differences between bona fide and spoofed parts. Our further investigation explains the varying nature of CMs' focus while making correct or incorrect predictions. These insights provide a basis for the design of CM models and the creation of datasets. Moreover, this work lays a foundation of interpretability in the field of partial spoofed audio detection that has not been well explored previously.

Submitted 4 June, 2024; originally announced June 2024.

Comments: Accepted at Interspeech 2024

arXiv:2405.01156 [pdf, other]

Self-Supervised Learning for Interventional Image Analytics: Towards Robust Device Trackers

Authors: Saahil Islam, Venkatesh N. Murthy, Dominik Neumann, Badhan Kumar Das, Puneet Sharma, Andreas Maier, Dorin Comaniciu, Florin C. Ghesu

Abstract: An accurate detection and tracking of devices such as guiding catheters in live X-ray image acquisitions is an essential prerequisite for endovascular cardiac interventions. This information is leveraged for procedural guidance, e.g., directing stent placements. To ensure procedural safety and efficacy, there is a need for high robustness no failures during tracking. To achieve that, one needs to efficiently tackle challenges, such as: device obscuration by contrast agent or other external devices or wires, changes in field-of-view or acquisition angle, as well as the continuous movement due to cardiac and respiratory motion. To overcome the aforementioned challenges, we propose a novel approach to learn spatio-temporal features from a very large data cohort of over 16 million interventional X-ray frames using self-supervision for image sequence data. Our approach is based on a masked image modeling technique that leverages frame interpolation based reconstruction to learn fine inter-frame temporal correspondences. The features encoded in the resulting model are fine-tuned downstream. Our approach achieves state-of-the-art performance and in particular robustness compared to ultra optimized reference solutions (that use multi-stage feature fusion, multi-task and flow regularization). The experiments show that our method achieves 66.31% reduction in maximum tracking error against reference solutions (23.20% when flow regularization is used); achieving a success score of 97.95% at a 3x faster inference speed of 42 frames-per-second (on GPU). The results encourage the use of our approach in various other tasks within interventional image analytics that require effective understanding of spatio-temporal semantics.

Submitted 2 May, 2024; originally announced May 2024.

arXiv:2404.17280 [pdf, other]

Device Feature based on Graph Fourier Transformation with Logarithmic Processing For Detection of Replay Speech Attacks

Authors: Mingrui He, Longting Xu, Han Wang, Mingjun Zhang, Rohan Kumar Das

Abstract: The most common spoofing attacks on automatic speaker verification systems are replay speech attacks. Detection of replay speech heavily relies on replay configuration information. Previous studies have shown that graph Fourier transform-derived features can effectively detect replay speech but ignore device and environmental noise effects. In this work, we propose a new feature, the graph frequency device cepstral coefficient, derived from the graph frequency domain using a device-related linear transformation. We also introduce two novel representations: graph frequency logarithmic coefficient and graph frequency logarithmic device coefficient. We evaluate our methods using traditional Gaussian mixture model and light convolutional neural network systems as classifiers. On the ASVspoof 2017 V2, ASVspoof 2019 physical access, and ASVspoof 2021 physical access datasets, our proposed features outperform known front-ends, demonstrating their effectiveness for replay speech detection.

Submitted 26 April, 2024; originally announced April 2024.

arXiv:2404.17279 [pdf, ps, other]

Bipartite powers of some classes of bipartite graphs

Authors: Indrajit Paul, Ashok Kumar Das

Abstract: Graph powers are a well-studied concept in graph theory. Analogous to graph powers, Chandran et al.[3] introduced the concept of bipartite powers for bipartite graphs. In this paper, we will demonstrate that some well-known classes of bipartite graphs, namely the interval bigraphs, proper interval bigraphs, and bigraphs of Ferrers dimension 2, are closed under the operation of taking bipartite powers. Finally, we define strongly closed property for bipartite graphs under powers and have shown that the class of chordal bipartite graphs is strongly closed under powers.

Submitted 26 April, 2024; originally announced April 2024.

arXiv:2404.09342 [pdf, other]

Face-voice Association in Multilingual Environments (FAME) Challenge 2024 Evaluation Plan

Authors: Muhammad Saad Saeed, Shah Nawaz, Muhammad Salman Tahir, Rohan Kumar Das, Muhammad Zaigham Zaheer, Marta Moscati, Markus Schedl, Muhammad Haris Khan, Karthik Nandakumar, Muhammad Haroon Yousaf

Abstract: The advancements of technology have led to the use of multimodal systems in various real-world applications. Among them, the audio-visual systems are one of the widely used multimodal systems. In the recent years, associating face and voice of a person has gained attention due to presence of unique correlation between them. The Face-voice Association in Multilingual Environments (FAME) Challenge 2024 focuses on exploring face-voice association under a unique condition of multilingual scenario. This condition is inspired from the fact that half of the world's population is bilingual and most often people communicate under multilingual scenario. The challenge uses a dataset namely, Multilingual Audio-Visual (MAV-Celeb) for exploring face-voice association in multilingual environments. This report provides the details of the challenge, dataset, baselines and task details for the FAME Challenge.

Submitted 16 April, 2024; v1 submitted 14 April, 2024; originally announced April 2024.

Comments: ACM Multimedia Conference - Grand Challenge

arXiv:2404.03511 [pdf, other]

Improved Total Domination and Total Roman Domination in Unit Disk Graphs

Authors: Sasmita Rout, Gautam Kumar Das

Abstract: Let $G=(V, E)$ be a simple undirected graph with no isolated vertex. A set $D_t\subseteq V$ is a total dominating set of $G$ if $(i)$ $D_t$ is a dominating set, and $(ii)$ the set $D_t$ induces a subgraph with no isolated vertex. The total dominating set of minimum cardinality is called the minimum total dominating set, and the size of the minimum total dominating set is called the total domination number ($γ_t(G)$). Given a graph $G$, the total dominating set (TDS) problem is to find a total dominating set of minimum cardinality. A Roman dominating function (RDF) on a graph $G$ is a function $f:V\rightarrow \{0,1,2\}$ such that each vertex $v\in V$ with $f(v)=0$ is adjacent to at least one vertex $u\in V$ with $f(u)=2$. A RDF $f$ of a graph $G$ is said to be a total Roman dominating function (TRDF) if the induced subgraph of $V_1\cup V_2$ does not contain any isolated vertex, where $V_i=\{u\in V|f(u)=i\}$. Given a graph $G$, the total Roman dominating set (TRDS) problem is to minimize the weight, $W(f)=\sum_{u\in V} f(u)$, called the total Roman domination number ($γ_{tR}(G)$). In this paper, we are the first to show that the TRDS problem is NP-complete in unit disk graphs (UDGs). Furthermore, we propose a $7.17$-factor approximation algorithm for the TDS problem and a $6.03$-factor approximation algorithm for the TRDS problem in geometric unit disk graphs. The running time for both algorithms is notably bounded by $O(n\log{k})$, where $n$ represents the number of vertices in the given UDG and $k$ represents the size of the independent set in (i.e., $D$ and $V_2$ in TDS and TRDS problems, respectively) the given UDG.

Submitted 4 April, 2024; originally announced April 2024.
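
The two definitions stated in the abstract above (total dominating set and total Roman dominating function) can be checked mechanically on a small graph. The following is a minimal Python sketch of those definitions only; it is not the authors' approximation algorithm, and the adjacency-set representation and the function names `is_total_dominating_set`, `is_total_roman_dominating`, and `weight` are assumptions made for illustration.

```python
from typing import Dict, Hashable, Set

# Assumed representation: a simple undirected graph as adjacency sets.
Graph = Dict[Hashable, Set[Hashable]]


def is_total_dominating_set(graph: Graph, d: Set[Hashable]) -> bool:
    """Check conditions (i) and (ii) together.

    "d is dominating" plus "d induces no isolated vertex" is equivalent
    to: every vertex of the graph has at least one neighbor in d.
    """
    return all(graph[v] & d for v in graph)


def is_total_roman_dominating(graph: Graph, f: Dict[Hashable, int]) -> bool:
    """Check that f: V -> {0, 1, 2} is a total Roman dominating function."""
    if any(f[v] not in (0, 1, 2) for v in graph):
        return False
    v2 = {v for v in graph if f[v] == 2}
    positive = {v for v in graph if f[v] >= 1}  # V_1 union V_2
    # Roman condition: each vertex with f(v) = 0 has a neighbor with f = 2.
    if any(f[v] == 0 and not (graph[v] & v2) for v in graph):
        return False
    # Total condition: the subgraph induced by V_1 union V_2 has no isolated vertex.
    return all(graph[v] & positive for v in positive)


def weight(f: Dict[Hashable, int]) -> int:
    """W(f) = sum of f(u) over all vertices u."""
    return sum(f.values())


# Path graph a - b - c - d used as a toy example.
path: Graph = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
good_f = {"a": 0, "b": 2, "c": 2, "d": 0}
bad_f = {"a": 0, "b": 2, "c": 0, "d": 1}  # b and d are isolated in V_1 union V_2
```

On the toy path, `{"b", "c"}` is a total dominating set while `{"a", "d"}` is not, and `good_f` satisfies both Roman and total conditions whereas `bad_f` fails the total condition. Checking a candidate is cheap; finding a minimum-weight one is the hard problem the paper addresses.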

arXiv:2403.19831 [pdf, other]

TASR: A Novel Trust-Aware Stackelberg Routing Algorithm to Mitigate Traffic Congestion

Authors: Doris E. M. Brown, Venkata Sriram Siddhardh Nadendla, Sajal K. Das

Abstract: Stackelberg routing platforms (SRP) reduce congestion in one-shot traffic networks by proposing optimal route recommendations to selfish travelers. Traditionally, Stackelberg routing is cast as a partial control problem where a fraction of traveler flow complies with route recommendations, while the remaining respond as selfish travelers. In this paper, a novel Stackelberg routing framework is formulated where the agents exhibit probabilistic compliance by accepting SRP's route recommendations with a trust probability. A greedy Trust-Aware Stackelberg Routing algorithm (in short, TASR) is proposed for SRP to compute unique path recommendations to each traveler flow with a unique demand. Simulation experiments are designed with random travel demands with diverse trust values on real road networks such as Sioux Falls, Chicago Sketch, and Sydney networks for both single-commodity and multi-commodity flows. The performance of TASR is compared with state-of-the-art Stackelberg routing methods in terms of traffic congestion and trust dynamics over repeated interaction between the SRP and the travelers. Results show that TASR improves network congestion without causing a significant reduction in trust towards the SRP, when compared to most well-known Stackelberg routing strategies.

Submitted 28 March, 2024; originally announced March 2024.

arXiv:2403.08613 [pdf, other]

Link Prediction for Social Networks using Representation Learning and Heuristic-based Features

Authors: Samarth Khanna, Sree Bhattacharyya, Sudipto Ghosh, Kushagra Agarwal, Asit Kumar Das

Abstract: The exponential growth in scale and relevance of social networks enable them to provide expansive insights. Predicting missing links in social networks efficiently can help in various modern-day business applications ranging from generating recommendations to influence analysis. Several categories of solutions exist for the same. Here, we explore various feature extraction techniques to generate representations of nodes and edges in a social network that allow us to predict missing links. We compare the results of using ten feature extraction techniques categorized across Structural embeddings, Neighborhood-based embeddings, Graph Neural Networks, and Graph Heuristics, followed by modeling with ensemble classifiers and custom Neural Networks. Further, we propose combining heuristic-based features and learned representations that demonstrate improved performance for the link prediction task on social network datasets. Using this method to generate accurate recommendations for many applications is a matter of further study that appears very promising. The code for all the experiments has been made public.

Submitted 13 March, 2024; originally announced March 2024.

Comments: Accepted to the MAISoN Workshop at IJCAI 2023

arXiv:2403.07025 [pdf, other]

Enhancing Quantum Variational Algorithms with Zero Noise Extrapolation via Neural Networks

Authors: Subhasree Bhattacharjee, Soumyadip Sarkar, Kunal Das, Bikramjit Sarkar

Abstract: In the emergent realm of quantum computing, the Variational Quantum Eigensolver (VQE) stands out as a promising algorithm for solving complex quantum problems, especially in the noisy intermediate-scale quantum (NISQ) era. However, the ubiquitous presence of noise in quantum devices often limits the accuracy and reliability of VQE outcomes. This research introduces a novel approach to ameliorate this challenge by utilizing neural networks for zero noise extrapolation (ZNE) in VQE computations. By employing the Qiskit framework, we crafted parameterized quantum circuits using the RY-RZ ansatz and examined their behavior under varying levels of depolarizing noise. Our investigations spanned from determining the expectation values of a Hamiltonian, defined as a tensor product of Z operators, under different noise intensities to extracting the ground state energy. To bridge the observed outcomes under noise with the ideal noise-free scenario, we trained a Feed Forward Neural Network on the error probabilities and their associated expectation values. Remarkably, our model proficiently predicted the VQE outcome under hypothetical noise-free conditions. By juxtaposing the simulation results with real quantum device executions, we unveiled the discrepancies induced by noise and showcased the efficacy of our neural network-based ZNE technique in rectifying them. This integrative approach not only paves the way for enhanced accuracy in VQE computations on NISQ devices but also underlines the immense potential of hybrid quantum-classical paradigms in circumventing the challenges posed by quantum noise. Through this research, we envision a future where quantum algorithms can be reliably executed on noisy devices, bringing us one step closer to realizing the full potential of quantum computing.

Submitted 10 March, 2024; originally announced March 2024.

arXiv:2403.02509 [pdf, other]

SPUQ: Perturbation-Based Uncertainty Quantification for Large Language Models

Authors: Xiang Gao, Jiaxin Zhang, Lalla Mouatadid, Kamalika Das

Abstract: In recent years, large language models (LLMs) have become increasingly prevalent, offering remarkable text generation capabilities. However, a pressing challenge is their tendency to make confidently wrong predictions, highlighting the critical need for uncertainty quantification (UQ) in LLMs. While previous works have mainly focused on addressing aleatoric uncertainty, the full spectrum of uncertainties, including epistemic, remains inadequately explored. Motivated by this gap, we introduce a novel UQ method, sampling with perturbation for UQ (SPUQ), designed to tackle both aleatoric and epistemic uncertainties. The method entails generating a set of perturbations for LLM inputs, sampling outputs for each perturbation, and incorporating an aggregation module that generalizes the sampling uncertainty approach for text generation tasks. Through extensive experiments on various datasets, we investigated different perturbation and aggregation techniques. Our findings show a substantial improvement in model uncertainty calibration, with a reduction in Expected Calibration Error (ECE) by 50% on average. Our findings suggest that our proposed UQ method offers promising steps toward enhancing the reliability and trustworthiness of LLMs.

Submitted 4 March, 2024; originally announced March 2024.

Comments: Accepted to appear at EACL 2024

arXiv:2402.12664 [pdf, other]

Discriminant Distance-Aware Representation on Deterministic Uncertainty Quantification Methods

Authors: Jiaxin Zhang, Kamalika Das, Sricharan Kumar

Abstract: Uncertainty estimation is a crucial aspect of deploying dependable deep learning models in safety-critical systems. In this study, we introduce a novel and efficient method for deterministic uncertainty estimation called Discriminant Distance-Awareness Representation (DDAR). Our approach involves constructing a DNN model that incorporates a set of prototypes in its latent representations, enabling us to analyze valuable feature information from the input data. By leveraging a distinction maximization layer over optimal trainable prototypes, DDAR can learn a discriminant distance-awareness representation. We demonstrate that DDAR overcomes feature collapse by relaxing the Lipschitz constraint that hinders the practicality of deterministic uncertainty methods (DUMs) architectures. Our experiments show that DDAR is a flexible and architecture-agnostic method that can be easily integrated as a pluggable layer with distance-sensitive metrics, outperforming state-of-the-art uncertainty estimation methods on multiple benchmark problems.

Submitted 19 February, 2024; originally announced February 2024.

Comments: AISTATS 2024

arXiv:2402.11490 [pdf, other]

Research status of the Mendeleev Periodic Table: a bibliometric analysis

Authors: Kamna Sharma, Deepak Kumar Das, Saibal Ray

Abstract: In this paper, we present a bibliometric analysis of the Mendeleev Periodic Table. We have conducted a comprehensive analysis of the Scopus-based database using the keyword "Mendeleev Periodic Table". Our findings suggest that the Mendeleev Periodic Table is an influential topic in the field of Inorganic as well as Organic Chemistry. Future researchers may focus on expanding our analysis to include other bibliometric indicators to gain a more comprehensive understanding of the impact of the Mendeleev Periodic Table in chemistry-based scientific investigations and even in the field of astrochemistry.

Submitted 18 February, 2024; originally announced February 2024.

Comments: 16 pages, 9 figures

ACM Class: F.2.2; I.2.7

arXiv:2402.11347 [pdf, other]

PhaseEvo: Towards Unified In-Context Prompt Optimization for Large Language Models

Authors: Wendi Cui, Jiaxin Zhang, Zhuohang Li, Hao Sun, Damien Lopez, Kamalika Das, Bradley Malin, Sricharan Kumar

Abstract: Crafting an ideal prompt for Large Language Models (LLMs) is a challenging task that demands significant resources and expert human input. Existing work treats the optimization of prompt instruction and in-context learning examples as distinct problems, leading to sub-optimal prompt performance. This research addresses this limitation by establishing a unified in-context prompt optimization framework, which aims to achieve joint optimization of the prompt instruction and examples. However, formulating such optimization in the discrete and high-dimensional natural language space introduces challenges in terms of convergence and computational efficiency. To overcome these issues, we present PhaseEvo, an efficient automatic prompt optimization framework that combines the generative capability of LLMs with the global search proficiency of evolution algorithms. Our framework features a multi-phase design incorporating innovative LLM-based mutation operators to enhance search efficiency and accelerate convergence. We conduct an extensive evaluation of our approach across 35 benchmark tasks. The results demonstrate that PhaseEvo significantly outperforms the state-of-the-art baseline methods by a large margin whilst maintaining good efficiency.

Submitted 17 February, 2024; originally announced February 2024.

Comments: 50 pages, 9 figures, 26 tables

arXiv:2402.08210 [pdf, other]

Quantum Computing-Enhanced Algorithm Unveils Novel Inhibitors for KRAS

Authors: Mohammad Ghazi Vakili, Christoph Gorgulla, AkshatKumar Nigam, Dmitry Bezrukov, Daniel Varoli, Alex Aliper, Daniil Polykovsky, Krishna M. Padmanabha Das, Jamie Snider, Anna Lyakisheva, Ardalan Hosseini Mansob, Zhong Yao, Lela Bitar, Eugene Radchenko, Xiao Ding, Jinxin Liu, Fanye Meng, Feng Ren, Yudong Cao, Igor Stagljar, Alán Aspuru-Guzik, Alex Zhavoronkov

Abstract: The discovery of small molecules with therapeutic potential is a long-standing challenge in chemistry and biology. Researchers have increasingly leveraged novel computational techniques to streamline the drug development process to increase hit rates and reduce the costs associated with bringing a drug to market. To this end, we introduce a quantum-classical generative model that seamlessly integrates the computational power of quantum algorithms trained on a 16-qubit IBM quantum computer with the established reliability of classical methods for designing small molecules. Our hybrid generative model was applied to designing new KRAS inhibitors, a crucial target in cancer therapy. We synthesized 15 promising molecules during our investigation and subjected them to experimental testing to assess their ability to engage with the target. Notably, among these candidates, two molecules, ISM061-018-2 and ISM061-22, each featuring unique scaffolds, stood out by demonstrating effective engagement with KRAS. ISM061-018-2 was identified as a broad-spectrum KRAS inhibitor, exhibiting a binding affinity to KRAS-G12D at 1.4 μM. Concurrently, ISM061-22 exhibited specific mutant selectivity, displaying heightened activity against KRAS G12R and Q61H mutants. To our knowledge, this work shows for the first time the use of a quantum-generative model to yield experimentally confirmed biological hits, showcasing the practical potential of quantum-assisted drug discovery to produce viable therapeutics.
Moreover, our findings reveal that the efficacy of distribution learning correlates with the number of qubits utilized, underlining the scalability potential of quantum computing resources. Overall, we anticipate our results to be a stepping stone towards developing more advanced quantum generative models in drug discovery.

Submitted 12 February, 2024; originally announced February 2024.

arXiv:2402.03735 [pdf, other]

Investigating the Utility of ChatGPT in the Issue Tracking System: An Exploratory Study

Authors: Joy Krishan Das, Saikat Mondal, Chanchal K. Roy

Abstract: Issue tracking systems serve as the primary tool for incorporating external users and customizing a software project to meet the users' requirements. However, the limited number of contributors and the challenge of identifying the best approach for each issue often impede effective resolution. Recently, an increasing number of developers are turning to AI tools like ChatGPT to enhance problem-solving efficiency. While previous studies have demonstrated the potential of ChatGPT in areas such as automatic program repair, debugging, and code generation, there is a lack of study on how developers explicitly utilize ChatGPT to resolve issues in their tracking system. Hence, this study aims to examine the interaction between ChatGPT and developers to analyze their prevalent activities and provide a resolution. In addition, we assess the code reliability by confirming if the code produced by ChatGPT was integrated into the project's codebase using the clone detection tool NiCad. Our investigation reveals that developers mainly use ChatGPT for brainstorming solutions but often opt to write their code instead of using ChatGPT-generated code, possibly due to concerns over the generation of "hallucinated code", as highlighted in the literature.

Submitted 6 February, 2024; originally announced February 2024.

Comments: Accepted in MSR 2024

arXiv:2402.02781 [pdf, other]

Dual Knowledge Distillation for Efficient Sound Event Detection

Authors: Yang Xiao, Rohan Kumar Das

Abstract: Sound event detection (SED) is essential for recognizing specific sounds and their temporal locations within acoustic signals. This becomes challenging particularly for on-device applications, where computational resources are limited. To address this issue, we introduce a novel framework referred to as dual knowledge distillation for developing efficient SED systems in this work. Our proposed dual knowledge distillation commences with temporal-averaging knowledge distillation (TAKD), utilizing a mean student model derived from the temporal averaging of the student model's parameters. This allows the student model to indirectly learn from a pre-trained teacher model, ensuring a stable knowledge distillation. Subsequently, we introduce embedding-enhanced feature distillation (EEFD), which involves incorporating an embedding distillation layer within the student model to bolster contextual learning. On the DCASE 2023 Task 4A public evaluation dataset, our proposed SED system with dual knowledge distillation, having merely one-third of the baseline model's parameters, demonstrates superior performance in terms of PSDS1 and PSDS2. This highlights the importance of the proposed dual knowledge distillation for compact SED systems, which can be ideal for edge devices.

Submitted 5 February, 2024; originally announced February 2024.

Comments: Accepted to ICASSP 2024 (Deep Neural Network Model Compression Workshop)
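
The TAKD step in the abstract above builds a "mean student" by temporally averaging the student model's parameters. The abstract does not say which averaging scheme is used; the sketch below assumes an exponential moving average, as in mean-teacher style training, and uses plain lists of floats in place of real model weights. The function name and the `decay` parameter are hypothetical.

```python
def update_mean_student(mean_params, student_params, decay=0.999):
    """One temporal-averaging step (assumed EMA, not confirmed by the abstract).

    Each mean-student parameter moves a fraction (1 - decay) of the way
    toward the current student parameter.
    """
    if len(mean_params) != len(student_params):
        raise ValueError("parameter lists must have the same length")
    return [decay * m + (1.0 - decay) * s
            for m, s in zip(mean_params, student_params)]


# Usage: call once after every optimizer step on the student.
mean = [0.0, 0.0]
for student in ([1.0, 2.0], [1.0, 2.0], [1.0, 2.0]):
    mean = update_mean_student(mean, student, decay=0.9)
```

With a decay close to 1 the mean student changes slowly, which is one way such an average can act as a smoother target than the raw student.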

arXiv:2401.17390 [pdf, other]

Customizing Language Model Responses with Contrastive In-Context Learning

Authors: Xiang Gao, Kamalika Das

Abstract: Large language models (LLMs) are becoming increasingly important for machine learning applications. However, it can be challenging to align LLMs with our intent, particularly when we want to generate content that is preferable over others or when we want the LLM to respond in a certain style or tone that is hard to describe. To address this challenge, we propose an approach that uses contrastive examples to better describe our intent. This involves providing positive examples that illustrate the true intent, along with negative examples that show what characteristics we want LLMs to avoid. The negative examples can be retrieved from labeled data, written by a human, or generated by the LLM itself. Before generating an answer, we ask the model to analyze the examples to teach itself what to avoid. This reasoning step provides the model with the appropriate articulation of the user's need and guides it towards generating a better answer. We tested our approach on both synthesized and real-world datasets, including StackExchange and Reddit, and found that it significantly improves performance compared to standard few-shot prompting.

Submitted 8 April, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

Comments: Accepted to appear at AAAI 2024

arXiv:2401.15108 [pdf, other]

Tacit algorithmic collusion in deep reinforcement learning guided price competition: A study using EV charge pricing game

Authors: Diwas Paudel, Tapas K. Das

Abstract: Players in pricing games with complex structures are increasingly adopting artificial intelligence (AI) aided learning algorithms to make pricing decisions for maximizing profits. This is raising concern for the antitrust agencies as the practice of using AI may promote tacit algorithmic collusion among otherwise independent players. Recent studies of games in canonical forms have shown contrasting claims ranging from none to a high level of tacit collusion among AI-guided players. In this paper, we examine the concern for tacit collusion by considering a practical game where EV charging hubs compete by dynamically varying their prices. Such a game is likely to be commonplace in the near future as EV adoption grows in all sectors of transportation. The hubs source power from the day-ahead (DA) and real-time (RT) electricity markets as well as from in-house battery storage systems. Their goal is to maximize profits via pricing and efficiently managing the cost of power usage. To aid our examination, we develop a two-step data-driven methodology. The first step obtains the DA commitment by solving a stochastic model. The second step generates the pricing strategies by solving a competitive Markov decision process model using a multi-agent deep reinforcement learning (MADRL) framework. We evaluate the resulting pricing strategies using an index for the level of tacit algorithmic collusion. An index value of zero indicates no collusion (perfect competition) and one indicates full collusion (monopolistic behavior).
Results from our numerical case study yield collusion index values between 0.14 and 0.45, suggesting a low to moderate level of collusion.

Submitted 10 May, 2024; v1 submitted 25 January, 2024; originally announced January 2024.
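
The abstract above describes an index that is zero under perfect competition and one under monopolistic behavior, but does not give its formula. One normalization with those two endpoints, assumed here purely for illustration, rescales observed profit between a competitive and a monopoly benchmark. The function and argument names are hypothetical and the paper's own index may be defined differently.

```python
def collusion_index(observed_profit, competitive_profit, monopoly_profit):
    """Assumed normalized index: 0 at the competitive benchmark, 1 at the monopoly benchmark."""
    gap = monopoly_profit - competitive_profit
    if gap <= 0:
        raise ValueError("monopoly profit must exceed competitive profit")
    return (observed_profit - competitive_profit) / gap
```

Under this assumed form, an observed profit 30% of the way from the competitive to the monopoly benchmark yields an index of 0.3, which would sit inside the 0.14 to 0.45 range the abstract reports.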

arXiv:2401.07944 [pdf, ps, other]

SemEval-2017 Task 4: Sentiment Analysis in Twitter using BERT

Authors: Rupak Kumar Das, Dr. Ted Pedersen

Abstract: This paper uses the BERT model, which is a transformer-based architecture, to solve task 4A, English Language, Sentiment Analysis in Twitter of SemEval2017. BERT is a very powerful large language model for classification tasks when the amount of training data is small. For this experiment, we have used the BERT(BASE) model, which has 12 hidden layers. This model provides better accuracy, precision, recall, and F1 score than the Naive Bayes baseline model. It performs better in binary classification subtasks than the multi-class classification subtasks. We also considered all kinds of ethical issues during this experiment, as Twitter data contains personal and sensitive information. The dataset and code used in our experiment can be found in this GitHub repository.

Submitted 19 June, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

arXiv:2401.02132 [pdf, other]

DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models

Authors: Wendi Cui, Jiaxin Zhang, Zhuohang Li, Lopez Damien, Kamalika Das, Bradley Malin, Sricharan Kumar

Abstract: Evaluating the quality and variability of text generated by Large Language Models (LLMs) poses a significant, yet unresolved research challenge. Traditional evaluation methods, such as ROUGE and BERTScore, which measure token similarity, often fail to capture the holistic semantic equivalence. This results in a low correlation with human judgments and intuition, which is especially problematic in high-stakes applications like healthcare and finance where reliability, safety, and robust decision-making are highly critical. This work proposes DCR, an automated framework for evaluating and improving the consistency of LLM-generated texts using a divide-conquer-reasoning approach. Unlike existing LLM-based evaluators that operate at the paragraph level, our method employs a divide-and-conquer evaluator (DCE) that breaks down the paragraph-to-paragraph comparison between two generated responses into individual sentence-to-paragraph comparisons, each evaluated based on predefined criteria. To facilitate this approach, we introduce an automatic metric converter (AMC) that translates the output from DCE into an interpretable numeric score. Beyond the consistency evaluation, we further present a reason-assisted improver (RAI) that leverages the analytical reasons with explanations identified by DCE to generate new responses aimed at reducing these inconsistencies.
Through comprehensive and systematic empirical analysis, we show that our approach outperforms state-of-the-art methods by a large margin (e.g., +19.3% and +24.3% on the SummEval dataset) in evaluating the consistency of LLM generation across multiple benchmarks in semantic, factual, and summarization consistency tasks. Our approach also substantially reduces nearly 90% of output inconsistencies, showing promise for effective hallucination mitigation.

Submitted 4 January, 2024; originally announced January 2024.

arXiv:2401.00959 [pdf, other]

Creating an Intelligent Dementia-Friendly Living Space: A Feasibility Study Integrating Assistive Robotics, Wearable Sensors, and Spatial Technology

Authors: Arshia A Khan, Rupak Kumar Das, Anna Martin, Dale Dowling, Rana Imtiaz

Abstract: This study investigates the integration of assistive therapeutic robotics, wearable sensors, and spatial sensors within an intelligent environment tailored for dementia care. The feasibility study aims to assess the collective impact of these technologies in enhancing caregiving by seamlessly integrating supportive technology in the background. The wearable sensors track physiological data, while spatial sensors monitor geo-spatial information, integrated into a system supporting residents without necessitating technical expertise. The designed space fosters various activities, including robot interactions, medication delivery, physical exercises like walking on a treadmill (Bruce protocol), entertainment, and household tasks, promoting cognitive stimulation through puzzles. Physiological data revealed significant participant engagement during robot interactions, indicating the potential effectiveness of robot-assisted activities in enhancing the quality of life for residents.

Submitted 1 January, 2024; originally announced January 2024.

arXiv:2312.16276 [pdf, ps, other]

Duality for Fitting's Multi-valued Modal logic via bitopology and biVietoris coalgebra

Authors: Litan Kumar Das, Kumar Sankar Ray, Prakash Chandra Mali

Abstract: Fitting's Heyting-valued logic and Heyting-valued modal logic have already been studied from an algebraic viewpoint. In addition to algebraic axiomatizations with the completeness of Fitting's Heyting-valued logic and Heyting-valued modal logic, both topological and coalgebraic dualities have also been developed for algebras of Fitting's Heyting-valued modal logic. Bitopological methods have recently been employed to investigate duality for Fitting's Heyting-valued logic. However, the concepts of bitopology and biVietoris coalgebras are conspicuously absent from the development of dualities for Fitting's many-valued modal logic. With this study, we try to bridge that gap. We develop a bitopological duality for algebras of Fitting's Heyting-valued modal logic. We construct a bi-Vietoris functor on the category $PBS_{\mathcal{L}}$ of $\mathcal{L}$-valued ($\mathcal{L}$ is a Heyting algebra) pairwise Boolean spaces. Finally, we obtain a dual equivalence between categories of biVietoris coalgebras and algebras of Fitting's Heyting-valued modal logic. As a result, we conclude that Fitting's many-valued modal logic is sound and complete with respect to the coalgebras of a biVietoris functor. We discuss the application of this coalgebraic approach to bitopological duality.

Submitted 1 July, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

arXiv:2311.09625 [pdf, other]

DECDM: Document Enhancement using Cycle-Consistent Diffusion Models

Authors: Jiaxin Zhang, Joy Rimchala, Lalla Mouatadid, Kamalika Das, Sricharan Kumar

Abstract: The performance of optical character recognition (OCR) heavily relies on document image quality, which is crucial for automatic document processing and document intelligence. However, most existing document enhancement methods require supervised data pairs, which raises concerns about data separation and privacy protection, and makes it challenging to adapt these methods to new domain pairs. To address these issues, we propose DECDM, an end-to-end document-level image translation method inspired by recent advances in diffusion models. Our method overcomes the limitations of paired training by independently training the source (noisy input) and target (clean output) models, making it possible to apply domain-specific diffusion models to other pairs. DECDM trains on one dataset at a time, eliminating the need to scan both datasets concurrently, and effectively preserving data privacy from the source or target domain. We also introduce simple data augmentation strategies to improve character-glyph conservation during translation. We compare DECDM with state-of-the-art methods on multiple synthetic data and benchmark datasets, such as document denoising and shadow removal, and demonstrate the superiority of performance quantitatively and qualitatively.

Submitted 16 November, 2023; originally announced November 2023.

Comments: Accepted by IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024)

arXiv:2311.01740 [pdf, other]

SAC3: Reliable Hallucination Detection in Black-Box Language Models via Semantic-aware Cross-check Consistency

Authors: Jiaxin Zhang, Zhuohang Li, Kamalika Das, Bradley A. Malin, Sricharan Kumar

Abstract: Hallucination detection is a critical step toward understanding the trustworthiness of modern language models (LMs). To achieve this goal, we re-examine existing detection approaches based on the self-consistency of LMs and uncover two types of hallucinations resulting from 1) question-level and 2) model-level, which cannot be effectively identified through self-consistency check alone. Building upon this discovery, we propose a novel sampling-based method, i.e., semantic-aware cross-check consistency (SAC3) that expands on the principle of self-consistency checking. Our SAC3 approach incorporates additional mechanisms to detect both question-level and model-level hallucinations by leveraging advances including semantically equivalent question perturbation and cross-model response consistency checking. Through extensive and systematic empirical analysis, we demonstrate that SAC3 outperforms the state of the art in detecting both non-factual and factual statements across multiple question-answering and open-domain generation benchmarks.

Submitted 18 February, 2024; v1 submitted 3 November, 2023; originally announced November 2023.

Comments: EMNLP 2023

arXiv:2311.00724 [pdf]

Fraud Analytics Using Machine-learning & Engineering on Big Data (FAME) for Telecom

Authors: Sudarson Roy Pratihar, Subhadip Paul, Pranab Kumar Dash, Amartya Kumar Das

Abstract: Telecom industries lose 46.3 billion USD globally due to fraud. Data mining and machine learning techniques (apart from rule-oriented approaches) have been used in the past, but efficiency has been low because fraud patterns change very rapidly. This paper presents an industrialized solution approach with a self-adaptive data mining technique and the application of big data technologies to detect fraud and discover novel fraud patterns in an accurate, efficient and cost-effective manner. The solution has been successfully demonstrated to detect International Revenue Share Fraud with <5% false positives. More than 1 terabyte of Call Detail Records from a reputed wholesale carrier and overseas telecom transit carrier has been used to conduct this study.

Submitted 31 October, 2023; originally announced November 2023.

Comments: Presented in International Conference in Indian Institute of Management, Bangalore, India

arXiv:2310.20153 [pdf, other]

Interactive Multi-fidelity Learning for Cost-effective Adaptation of Language Model with Sparse Human Supervision

Authors: Jiaxin Zhang, Zhuohang Li, Kamalika Das, Sricharan Kumar

Abstract: Large language models (LLMs) have demonstrated remarkable capabilities in various tasks. However, their suitability for domain-specific tasks is limited due to their immense scale at deployment, susceptibility to misinformation, and more importantly, high data annotation costs. We propose a novel Interactive Multi-Fidelity Learning (IMFL) framework for the cost-effective development of small domain-specific LMs under limited annotation budgets. Our approach formulates the domain-specific fine-tuning process as a multi-fidelity learning problem, focusing on identifying the optimal acquisition strategy that balances between low-fidelity automatic LLM annotations and high-fidelity human annotations to maximize model performance. We further propose an exploration-exploitation query strategy that enhances annotation diversity and informativeness, incorporating two innovative designs: 1) prompt retrieval that selects in-context examples from human-annotated samples to improve LLM annotation, and 2) variable batch size that controls the order for choosing each fidelity to facilitate knowledge distillation, ultimately enhancing annotation quality. Extensive experiments on financial and medical tasks demonstrate that IMFL achieves superior performance compared with single fidelity annotations. Given a limited budget of human annotation, IMFL significantly outperforms the human annotation baselines in all four tasks and achieves performance very close to that of human annotations on two of the tasks.
These promising results suggest that the high human annotation costs in domain-specific tasks can be significantly reduced by employing IMFL, which utilizes fewer human annotations, supplemented with cheaper and faster LLM (e.g., GPT-3.5) annotations to achieve comparable performance.

Submitted 30 October, 2023; originally announced October 2023.

Comments: This work has been accepted by NeurIPS 2023

arXiv:2310.18660 [pdf, other]

Foundation Models for Generalist Geospatial Artificial Intelligence

Authors: Johannes Jakubik, Sujit Roy, C. E. Phillips, Paolo Fraccaro, Denys Godwin, Bianca Zadrozny, Daniela Szwarcman, Carlos Gomes, Gabby Nyirjesy, Blair Edwards, Daiki Kimura, Naomi Simumba, Linsong Chu, S. Karthik Mukkavilli, Devyani Lambhate, Kamal Das, Ranjini Bangalore, Dario Oliveira, Michal Muszynski, Kumar Ankur, Muthukumaran Ramasubramanian, Iksha Gurung, Sam Khallaghi, Hanxi, Li, et al. (8 additional authors not shown)

Abstract: Significant progress in the development of highly adaptable and reusable Artificial Intelligence (AI) models is expected to have a significant impact on Earth science and remote sensing. Foundation models are pre-trained on large unlabeled datasets through self-supervision, and then fine-tuned for various downstream tasks with small labeled datasets. This paper introduces a first-of-a-kind framework for the efficient pre-training and fine-tuning of foundational models on extensive geospatial data. We have utilized this framework to create Prithvi, a transformer-based geospatial foundational model pre-trained on more than 1TB of multispectral satellite imagery from the Harmonized Landsat-Sentinel 2 (HLS) dataset. Our study demonstrates the efficacy of our framework in successfully fine-tuning Prithvi to a range of Earth observation tasks that have not been tackled by previous work on foundation models involving multi-temporal cloud gap imputation, flood mapping, wildfire scar segmentation, and multi-temporal crop segmentation. Our experiments show that the pre-trained model accelerates the fine-tuning process compared to leveraging randomly initialized weights. In addition, pre-trained Prithvi compares well against the state-of-the-art, e.g., outperforming a conditional GAN model in multi-temporal cloud imputation by up to 5pp (or 5.7%) in the structural similarity index.
Finally, due to the limited availability of labeled data in the field of Earth observation, we gradually reduce the quantity of available labeled data for refining the model to evaluate data efficiency and demonstrate that data can be decreased significantly without affecting the model's accuracy. The pre-trained 100 million parameter model and corresponding fine-tuning workflows have been released publicly as open source contributions to the global Earth sciences community through Hugging Face.

Submitted 8 November, 2023; v1 submitted 28 October, 2023; originally announced October 2023.

arXiv:2310.09177 [pdf, ps, other]

doi 10.3390/s24082509

Future Industrial Applications: Exploring LPWAN-Driven IoT Protocols

Authors: Mahbubul Islam, Hossain Md. Mubashshir Jamil, Samiul Ahsan Pranto, Rupak Kumar Das, Al Amin, Arshia Khan

Abstract: The Internet of Things (IoT) will bring about the next industrial revolution in Industry 4.0. The communication aspect of IoT devices is one of the most critical factors in choosing the suitable device for the suitable usage. So far, the IoT physical layer communication challenges have been met with various communications protocols that provide varying strengths and weaknesses. Moreover, most of them are wireless protocols due to the sheer number of device requirements for IoT. This paper summarizes the network architectures of some of the most popular IoT wireless communications protocols. It also presents a comparative analysis of critical features, including power consumption, coverage, data rate, security, cost, and Quality of Service (QoS). This comparative study shows that Low Power Wide Area Network (LPWAN) based IoT protocols (LoRa, Sigfox, NB-IoT, LTE-M) are more suitable for future industrial applications because of their energy efficiency, high coverage, and cost efficiency. In addition, the study also presents an industrial Internet of Things (IIoT) application perspective on the suitability of LPWAN protocols in a particular scenario and addresses some open issues that need to be researched. Thus, this study can assist in deciding the most suitable protocol for an industrial and production field.

Submitted 19 January, 2024; v1 submitted 13 October, 2023; originally announced October 2023.

Report number: s24082509

Journal ref: Sensors 2024, 24, 2509

arXiv:2309.13691 [pdf, other]

On Simultaneous Information and Energy Transmission through Quantum Channels

Authors: Bishal Kumar Das, Lav R. Varshney, Vaibhav Madhok

Abstract: The optimal rate at which information can be sent through a quantum channel when the transmitted signal must simultaneously carry some minimum amount of energy is characterized. To do so, we introduce the quantum-classical analogue of the capacity-power function and generalize results in classical information theory for transmitting classical information through noisy channels. We show that the capacity-power function for a quantum channel, for both unassisted and private protocols, is concave and also prove additivity for unentangled and uncorrelated ensembles of input signals. This implies we do not need regularized formulas for calculation. We numerically demonstrate these properties for some standard channel models. We obtain analytical expressions for the capacity-power function for the case of noiseless channels using properties of random quantum states and concentration phenomenon in large Hilbert spaces.

Submitted 13 October, 2023; v1 submitted 24 September, 2023; originally announced September 2023.

Comments: 13 pages, 16 figures

arXiv:2308.12614 [pdf, ps, other]

Obstruction characterization of co-TT graphs

Authors: Ashok Kumar Das, Indrajit Paul

Abstract: Threshold tolerance graphs and their complement graphs (known as co-TT graphs) were introduced by Monma, Reed and Trotter [24]. Introducing the concept of negative interval, Hell et al. [19] defined signed-interval bigraphs/digraphs and have shown that they are equivalent to several seemingly different classes of bigraphs/digraphs. They have also shown that co-TT graphs are equivalent to symmetric signed-interval digraphs. In this paper we characterize signed-interval bigraphs and signed-interval graphs respectively in terms of their biadjacency matrices and adjacency matrices. Finally, based on the geometric representation of signed-interval graphs, we have settled the open problem of forbidden induced subgraph characterization of co-TT graphs posed by Monma, Reed and Trotter in the same paper.

Submitted 24 August, 2023; originally announced August 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2206.05917

arXiv:2308.06272 [pdf, other]

Beyond Reality: The Pivotal Role of Generative AI in the Metaverse

Authors: Vinay Chamola, Gaurang Bansal, Tridib Kumar Das, Vikas Hassija, Naga Siva Sai Reddy, Jiacheng Wang, Sherali Zeadally, Amir Hussain, F. Richard Yu, Mohsen Guizani, Dusit Niyato

Abstract: Imagine stepping into a virtual world that's as rich, dynamic, and interactive as our physical one. This is the promise of the Metaverse, and it's being brought to life by the transformative power of Generative Artificial Intelligence (AI). This paper offers a comprehensive exploration of how generative AI technologies are shaping the Metaverse, transforming it into a dynamic, immersive, and interactive virtual world. We delve into the applications of text generation models like ChatGPT and GPT-3, which are enhancing conversational interfaces with AI-generated characters. We explore the role of image generation models such as DALL-E and MidJourney in creating visually stunning and diverse content. We also examine the potential of 3D model generation technologies like Point-E and Lumirithmic in creating realistic virtual objects that enrich the Metaverse experience. But the journey doesn't stop there. We also address the challenges and ethical considerations of implementing these technologies in the Metaverse, offering insights into the balance between user control and AI automation. This paper is not just a study, but a guide to the future of the Metaverse, offering readers a roadmap to harnessing the power of generative AI in creating immersive virtual worlds.

Submitted 28 July, 2023; originally announced August 2023.

Comments: 8 pages, 4 figures

arXiv:2308.00943 [pdf, other]

IIDS: Design of Intelligent Intrusion Detection System for Internet-of-Things Applications

Authors: KG Raghavendra Narayan, Srijanee Mookherji, Vanga Odelu, Rajendra Prasath, Anish Chand Turlapaty, Ashok Kumar Das

Abstract: With rapid technological growth, security attacks are drastically increasing. In many crucial Internet-of-Things (IoT) applications such as healthcare and defense, the early detection of security attacks plays a significant role in protecting huge resources. An intrusion detection system is used to address this problem. The signature-based approaches fail to detect zero-day attacks, so anomaly-based detection approaches, particularly AI tools, are becoming popular. In addition, the imbalanced dataset leads to biased results. In Machine Learning (ML) models, F1 score is an important metric to measure the accuracy of class-level correct predictions. The model may fail to detect the target samples if the F1 score is considerably low. This will lead to unrecoverable consequences in sensitive applications such as healthcare and defense. So, any improvement in the F1 score has a significant impact on resource protection. In this paper, we present a framework for an ML-based intrusion detection system for an imbalanced dataset. In this study, the most recent dataset, namely CICIoT2023, is considered. The random forest (RF) algorithm is used in the proposed framework. The proposed approach improves precision, recall and F1 score by 3.72%, 3.75% and 4.69%, respectively, compared with the existing method. Additionally, for unsaturated classes (i.e., classes with F1 score < 0.99), F1 score improved significantly by 7.9%. As a result, the proposed approach is more suitable for IoT security applications for efficient detection of intrusion and is useful in further studies.

Submitted 2 August, 2023; originally announced August 2023.

arXiv:2308.00107 [pdf, other]

Validation of a Zero-Shot Learning Natural Language Processing Tool for Data Abstraction from Unstructured Healthcare Data

Authors: Basil Kaufmann, Dallin Busby, Chandan Krushna Das, Neeraja Tillu, Mani Menon, Ashutosh K. Tewari, Michael A. Gorin

Abstract: Objectives: To describe the development and validation of a zero-shot learning natural language processing (NLP) tool for abstracting data from unstructured text contained within PDF documents, such as those found within electronic health records. Materials and Methods: A data abstraction tool based on the GPT-3.5 model from OpenAI was developed and compared to three physician human abstractors in terms of time to task completion and accuracy for abstracting data on 14 unique variables from a set of 199 de-identified radical prostatectomy pathology reports. The reports were processed by the software tool in vectorized and scanned formats to establish the impact of optical character recognition on data abstraction. The tool was assessed for superiority for data abstraction speed and non-inferiority for accuracy. Results: The human abstractors required a mean of 101 s per report for data abstraction, with times varying from 15 to 284 s. In comparison, the software tool required a mean of 12.8 s to process the vectorized reports and a mean of 15.8 s to process the scanned reports (P < 0.001). The overall accuracies of the three human abstractors were 94.7%, 97.8%, and 96.4% for the combined set of 2786 datapoints. The software tool had an overall accuracy of 94.2% for the vectorized reports, proving to be non-inferior to the human abstractors at a margin of -10% (α = 0.025). The tool had a slightly lower accuracy of 88.7% using the scanned reports, proving to be non-inferior to 2 out of 3 human abstractors.
Conclusion: The developed zero-shot learning NLP tool affords researchers comparable levels of accuracy to that of human abstractors, with significant time savings benefits. Because of the lack of need for task-specific model training, the developed tool is highly generalizable and can be used for a wide variety of data abstraction tasks, even outside the field of medicine.

Submitted 23 July, 2023; originally announced August 2023.

Comments: 10 pages, 3 figures, 1 table, 3 supplementary figures

MSC Class: 68T50; ACM Class: I.2.7

arXiv:2307.16262 [pdf, other]

Validating polyp and instrument segmentation methods in colonoscopy through Medico 2020 and MedAI 2021 Challenges

Authors: Debesh Jha, Vanshali Sharma, Debapriya Banik, Debayan Bhattacharya, Kaushiki Roy, Steven A. Hicks, Nikhil Kumar Tomar, Vajira Thambawita, Adrian Krenzer, Ge-Peng Ji, Sahadev Poudel, George Batchkala, Saruar Alam, Awadelrahman M. A. Ahmed, Quoc-Huy Trinh, Zeshan Khan, Tien-Phat Nguyen, Shruti Shrestha, Sabari Nathan, Jeonghwan Gwak, Ritika K. Jha, Zheyuan Zhang, Alexander Schlaefer, Debotosh Bhattacharjee, M. K. Bhuyan, et al. (8 additional authors not shown)

Abstract: Automatic analysis of colonoscopy images has been an active field of research motivated by the importance of early detection of precancerous polyps. However, detecting polyps during the live examination can be challenging due to various factors such as variation of skills and experience among the endoscopists, lack of attentiveness, and fatigue leading to a high polyp miss-rate. Deep learning has emerged as a promising solution to this challenge as it can assist endoscopists in detecting and classifying overlooked polyps and abnormalities in real time. In addition to the algorithm's accuracy, transparency and interpretability are crucial to explaining the whys and hows of the algorithm's prediction. Further, most algorithms are developed in private data, closed source, or proprietary software, and methods lack reproducibility. Therefore, to promote the development of efficient and transparent methods, we have organized the "Medico automatic polyp segmentation (Medico 2020)" and "MedAI: Transparency in Medical Image Segmentation (MedAI 2021)" competitions. We present a comprehensive summary and analyze each contribution, highlight the strength of the best-performing methods, and discuss the possibility of clinical translations of such methods into the clinic.
For the transparency task, a multi-disciplinary team, including expert gastroenterologists, assessed each submission and evaluated the team based on open-source practices, failure case analysis, ablation studies, usability and understandability of evaluations to gain a deeper understanding of the models' credibility for clinical deployment. Through the comprehensive analysis of the challenge, we not only highlight the advancements in polyp and surgical instrument segmentation but also encourage qualitative evaluation for building more transparent and understandable AI-based colonoscopy systems.

Submitted 6 May, 2024; v1 submitted 30 July, 2023; originally announced July 2023.

arXiv:2307.08140 [pdf, other]

GastroVision: A Multi-class Endoscopy Image Dataset for Computer Aided Gastrointestinal Disease Detection

Authors: Debesh Jha, Vanshali Sharma, Neethi Dasu, Nikhil Kumar Tomar, Steven Hicks, M. K. Bhuyan, Pradip K. Das, Michael A. Riegler, Pål Halvorsen, Ulas Bagci, Thomas de Lange

Abstract: Integrating real-time artificial intelligence (AI) systems in clinical practices faces challenges such as scalability and acceptance. These challenges include data availability, biased outcomes, data quality, lack of transparency, and underperformance on unseen datasets from different distributions. The scarcity of large-scale, precisely labeled, and diverse datasets is the major challenge for clinical integration. This scarcity is also due to the legal restrictions and extensive manual efforts required for accurate annotations from clinicians. To address these challenges, we present GastroVision, a multi-center open-access gastrointestinal (GI) endoscopy dataset that includes different anatomical landmarks, pathological abnormalities, polyp removal cases and normal findings (a total of 27 classes) from the GI tract. The dataset comprises 8,000 images acquired from Bærum Hospital in Norway and Karolinska University Hospital in Sweden and was annotated and verified by experienced GI endoscopists. Furthermore, we validate the significance of our dataset with extensive benchmarking based on the popular deep learning based baseline models. We believe our dataset can facilitate the development of AI-based algorithms for GI disease detection and classification. Our dataset is available at https://osf.io/84e7f/.

Submitted 17 August, 2023; v1 submitted 16 July, 2023; originally announced July 2023.

arXiv:2307.05390 [pdf, other]

CrysMMNet: Multimodal Representation for Crystal Property Prediction

Authors: Kishalay Das, Pawan Goyal, Seung-Cheol Lee, Satadeep Bhattacharjee, Niloy Ganguly

Abstract: Machine Learning models have emerged as a powerful tool for fast and accurate prediction of different crystalline properties. Existing state-of-the-art models rely on a single modality of crystal data, i.e., crystal graph structure, where they construct a multi-graph by establishing edges between nearby atoms in 3D space and apply a GNN to learn materials representation. Thereby, they encode local chemical semantics around the atoms successfully but fail to capture important global periodic structural information like space group number, crystal symmetry, rotational information, etc., which influence different crystal properties. In this work, we leverage textual descriptions of materials to model global structural information into graph structure and learn a more robust and enriched representation of crystalline materials. To this effect, we first curate a textual dataset for crystalline material databases containing descriptions of each material. Further, we propose CrysMMNet, a simple multi-modal framework, which fuses both structural and textual representation together to generate a joint multimodal representation of crystalline materials. We conduct extensive experiments on two benchmark datasets across ten different properties to show that CrysMMNet outperforms existing state-of-the-art baseline methods by a good margin. We also observe that fusing the textual representation with crystal graph structure provides consistent improvement for all the SOTA GNN models compared to their own vanilla versions.
We have shared the textual dataset that we have curated for both the benchmark material databases with the community for future use.

Submitted 9 June, 2023; originally announced July 2023.

Comments: 14 pages, 4 figures

arXiv:2306.03785 [pdf, other]

Development of On-Ground Hardware In Loop Simulation Facility for Space Robotics

Authors: Roshan Sah, Raunak Srivastava, Kaushik Das

Abstract: Over a couple of decades, space junk has increased rapidly, which has caused significant threats to the LEO operation satellites. An Active Debris Removal (ADR) concept continuously evolves for space junk removal. One of the ADR methods is Space Robotics, whose function is to chase, capture and de-orbit the space junk. This paper presents the development of an on-ground space robotics facility in the TCS Research for on-orbit servicing (OOS) like refueling and debris capture experiments. A Hardware in Loop Simulation (HILS) system will be used for integrated system development, testing, and demonstration of on-orbit docking mechanisms. The HILS test facility of TCS Research Lab will use two URs in which one UR is attached to the RG2 gripper, and the other is attached to a force-torque sensor and with a scaled mock-up model. The first UR5 will be mounted on a 7-axis linear rail and contain the docking probe. First, UR5 with a suitable gripper has to interface its control boxes. The grasping algorithm was run through the ROS interface line to demonstrate and validate the on-orbit operations. The manipulator will be mounted with LIDAR and a camera to visualize the mock-up model, find the target model's pose and rotational velocity estimation, and a gripper that will move relative to the target model. The other manipulator has the UR10 control, providing rotational and random motion to the mock-up, enabling a dynamic simulator fed by force-torque data.
The dynamic simulator is fed by the orbit propagator, which will provide the orbiting environment to the target model. For the simulation of the docking and grasping of the target model, a linear rail of a 6m setup is still in the procurement process. Once reaching proximity, the grasping algorithm will be launched to capture the target model after reading the random motion of the mock-up model.

Submitted 3 June, 2023; originally announced June 2023.

Comments: 11 pages, 15 figures, Accepted at Small Satellite Conference 2023; Weekday Sessions: Orbital Debris, SSA & STM; Tuesday, 8th Aug 2023

arXiv:2305.15901 [pdf, other]

Consistent Optimal Transport with Empirical Conditional Measures

Authors: Piyushi Manupriya, Rachit Keerti Das, Sayantan Biswas, Saketha Nath Jagarlapudi

Abstract: Given samples from two joint distributions, we consider the problem of Optimal Transportation (OT) between them when conditioned on a common variable. We focus on the general setting where the conditioned variable may be continuous, and the marginals of this variable in the two joint distributions may not be the same. In such settings, standard OT variants cannot be employed, and novel estimation techniques are necessary. Since the main challenge is that the conditional distributions are not explicitly available, the key idea in our OT formulation is to employ kernelized-least-squares terms computed over the joint samples, which implicitly match the transport plan's marginals with the empirical conditionals. Under mild conditions, we prove that our estimated transport plans, as a function of the conditioned variable, are asymptotically optimal. For finite samples, we show that the deviation in terms of our regularized objective is bounded by $O(1/m^{1/4})$, where $m$ is the number of samples. We also discuss how the conditional transport plan could be modelled using explicit probabilistic models as well as using implicit generative ones. We empirically verify the consistency of our estimator on synthetic datasets, where the optimal plan is analytically known. When employed in applications like prompt learning for few-shot classification and conditional-generation in the context of predicting cell responses to treatment, our methodology improves upon state-of-the-art methods.

Submitted 10 June, 2024; v1 submitted 25 May, 2023; originally announced May 2023.

arXiv:2305.11039 [pdf, other]

Deep PackGen: A Deep Reinforcement Learning Framework for Adversarial Network Packet Generation

Authors: Soumyadeep Hore, Jalal Ghadermazi, Diwas Paudel, Ankit Shah, Tapas K. Das, Nathaniel D. Bastian

Abstract: Recent advancements in artificial intelligence (AI) and machine learning (ML) algorithms, coupled with the availability of faster computing infrastructure, have enhanced the security posture of cybersecurity operations centers (defenders) through the development of ML-aided network intrusion detection systems (NIDS). Concurrently, the abilities of adversaries to evade security have also increased with the support of AI/ML models. Therefore, defenders need to proactively prepare for evasion attacks that exploit the detection mechanisms of NIDS. Recent studies have found that the perturbation of flow-based and packet-based features can deceive ML models, but these approaches have limitations. Perturbations made to the flow-based features are difficult to reverse-engineer, while samples generated with perturbations to the packet-based features are not playable. Our methodological framework, Deep PackGen, employs deep reinforcement learning to generate adversarial packets and aims to overcome the limitations of approaches in the literature. By taking raw malicious network packets as inputs and systematically making perturbations on them, Deep PackGen camouflages them as benign packets while still maintaining their functionality. In our experiments, using publicly available data, Deep PackGen achieved an average adversarial success rate of 66.4% against various ML models and across different attack types.
Our investigation also revealed that more than 45% of the successful adversarial samples were out-of-distribution packets that evaded the decision boundaries of the classifiers. The knowledge gained from our study on the adversary's ability to make specific evasive perturbations to different types of malicious packets can help defenders enhance the robustness of their NIDS against evolving adversarial attacks.

Submitted 18 May, 2023; originally announced May 2023.

arXiv:2305.00315 [pdf, other]

Decentralised Identity Federations using Blockchain

Authors: Mirza Kamrul Bashar Shuhan, Syed Md. Hasnayeen, Tanmoy Krishna Das, Md. Nazmus Sakib, Md Sadek Ferdous

Abstract: Federated Identity Management has proven its worth by offering economic benefits and convenience to Service Providers and users alike. In such federations, the Identity Provider (IdP) is the solitary entity responsible for managing user credentials and generating assertions for the users, who are requesting access to a service provider's resource. This makes the IdP centralised and exhibits a single point of failure for the federation, making the federation prone to catastrophic damages. The paper presents our effort in designing and implementing a decentralised system in establishing an identity federation. In its attempt to decentralise the IdP in the federation, the proposed system relies on blockchain technology, thereby mitigating the single point of failure shortcoming of existing identity federations. The system is designed using a set of requirements. In this article, we explore different aspects of designing and developing the system, present its protocol flow, analyse its performance, and evaluate its security using ProVerif, a state-of-the-art formal protocol verification tool.

Submitted 29 April, 2023; originally announced May 2023.

arXiv:2304.14269 [pdf]

Performance Analysis of DNA Crossbar Arrays for High-Density Memory Storage Applications

Authors: Arpan De, Hashem Mohammad, Yiren Wang, Rajkumar Kubendran, Arindam K. Das, M. P. Anantram

Abstract: Deoxyribonucleic acid (DNA) has emerged as a promising building block for next-generation ultra-high density storage devices. Although DNA has high durability and extremely high density in nature, its potential as the basis of storage devices is currently hindered by limitations such as expensive and complex fabrication processes and time-consuming read-write operations. In this article, we propose the use of a DNA crossbar array architecture for an electrically readable Read-Only Memory (DNA-ROM). While information can be written error-free to a DNA-ROM array using appropriate sequence encoding, its read accuracy can be affected by several factors such as array size, interconnect resistance, and Fermi energy deviations from HOMO levels of DNA strands employed in the crossbar. We study the impact of array size and interconnect resistance on the bit error rate of a DNA-ROM array through extensive Monte Carlo simulations. We have also analyzed the performance of our proposed DNA crossbar array for an image storage application, as a function of array size and interconnect resistance. While we expect that future advances in bioengineering and materials science will address some of the fabrication challenges associated with DNA crossbar arrays, we believe that the comprehensive body of results we present in this paper establishes the technical viability of DNA crossbar arrays as low-power, high-density storage devices.
Finally, our analysis of array performance vis-a-vis interconnect resistance should provide valuable insights into aspects of the fabrication process such as the proper choice of interconnects necessary for ensuring high read accuracies.

Submitted 6 April, 2023; originally announced April 2023.

arXiv:2304.07679 [pdf]

doi 10.1145/3580252.3586972

Using Geographic Location-based Public Health Features in Survival Analysis

Authors: Navid Seidi, Ardhendu Tripathy, Sajal K. Das

Abstract: Time elapsed till an event of interest is often modeled using the survival analysis methodology, which estimates a survival score based on the input features. There is a resurgence of interest in developing more accurate prediction models for time-to-event prediction in personalized healthcare using modern tools such as neural networks. Higher quality features and more frequent observations improve the predictions for a patient, however, the impact of including a patient's geographic location-based public health statistics on individual predictions has not been studied. This paper proposes a complementary improvement to survival analysis models by incorporating public health statistics in the input features. We show that including geographic location-based public health information results in a statistically significant improvement in the concordance index evaluated on the Surveillance, Epidemiology, and End Results (SEER) dataset containing nationwide cancer incidence data. The improvement holds for both the standard Cox proportional hazards model and the state-of-the-art Deep Survival Machines model. Our results indicate the utility of geographic location-based public health features in survival analysis.

Submitted 15 April, 2023; originally announced April 2023.

Journal ref: 2023 IEEE/ACM Conference on Connected Health: Applications, Systems and Engineering Technologies (CHASE), 2023, 80-91

arXiv:2304.02152 [pdf, other]

Can Adversarial Networks Make Uninformative Colonoscopy Video Frames Clinically Informative?

Authors: Vanshali Sharma, M. K. Bhuyan, Pradip K. Das

Abstract: Various artifacts, such as ghost colors, interlacing, and motion blur, hinder diagnosing colorectal cancer (CRC) from videos acquired during colonoscopy. The frames containing these artifacts are called uninformative frames and are present in large proportions in colonoscopy videos. To alleviate the impact of artifacts, we propose an adversarial network based framework to convert uninformative frames to clinically relevant frames. We examine the effectiveness of the proposed approach by evaluating the translated frames for polyp detection using YOLOv5. Preliminary results present improved detection performance along with elegant qualitative outcomes. We also examine the failure cases to determine the directions for future work.

Submitted 4 April, 2023; originally announced April 2023.

Comments: Student Abstract, Accepted at AAAI 2023

arXiv:2302.13552 [pdf, other]

Dispatching Point Selection for a Drone-Based Delivery System Operating in a Mixed Euclidean-Manhattan Grid

Authors: Francesco Betti Sorbelli, Federico Corò, Sajal K. Das, Cristina M. Pinotti, Anil Shende

Abstract: In this paper, we present a drone-based delivery system that assumes to deal with two different mixed-areas, i.e., rural and urban. In these mixed-areas, called EM-grids, the distances are measured with two different metrics, and the shortest path between two destinations concatenates the Euclidean and Manhattan metrics. Due to payload constraints, the drone serves a single customer at a time returning back to the dispatching point (DP) after each delivery to load a new parcel for the next customer. In this paper, we present the 1-Median Euclidean-Manhattan grid Problem (MEMP) for EM-grids, whose goal is to determine the drone's DP position that minimizes the sum of the distances between all the locations to be served and the point itself. We study the MEMP on two different scenarios, i.e., one in which all the customers in the area need to be served (full-grid) and another one where only a subset of these must be served (partial-grid). For the full-grid scenario we devise optimal, approximation, and heuristic algorithms, while for the partial-grid scenario we devise optimal and heuristic algorithms. Eventually, we comprehensively evaluate our algorithms on generated synthetic and quasi-real data.

Submitted 27 February, 2023; originally announced February 2023.

Showing 1–50 of 206 results for author: Das, K