
Showing 1–50 of 240 results for author: Joshi, S

  1. arXiv:2407.11141  [pdf, other]

    cs.CV

    UFQA: Utility guided Fingerphoto Quality Assessment

    Authors: Amol S. Joshi, Ali Dabouei, Jeremy Dawson, Nasser Nasrabadi

    Abstract: Quality assessment of fingerprints captured using digital cameras and smartphones, also called fingerphotos, is a challenging problem in biometric recognition systems. As contactless biometric modalities are gaining more attention, their reliability should also be improved. Many factors, such as illumination, image contrast, camera angle, etc., in fingerphoto acquisition introduce various types of…

    Submitted 15 July, 2024; originally announced July 2024.

  2. arXiv:2407.09481  [pdf]

    cs.CY cs.HC

    ChatGPT and Vaccine Hesitancy: A Comparison of English, Spanish, and French Responses Using a Validated Scale

    Authors: Saubhagya Joshi, Eunbin Ha, Yonaira Rivera, Vivek K. Singh

    Abstract: ChatGPT is a popular information system (over 1 billion visits in August 2023) that can generate natural language responses to user queries. It is important to study the quality and equity of its responses on health-related topics, such as vaccination, as they may influence public health decision-making. We use the Vaccine Hesitancy Scale (VHS) proposed by Shapiro et al. to measure the hesitancy…

    Submitted 6 May, 2024; originally announced July 2024.

    Comments: 11 pages. Appeared in the Proceedings of the AMIA Informatics Summit, 2024

  3. arXiv:2407.02968  [pdf, other]

    cs.CV cs.AI cs.CC cs.ET

    Unified Anomaly Detection methods on Edge Device using Knowledge Distillation and Quantization

    Authors: Sushovan Jena, Arya Pulkit, Kajal Singh, Anoushka Banerjee, Sharad Joshi, Ananth Ganesh, Dinesh Singh, Arnav Bhavsar

    Abstract: With the rapid advances in deep learning and smart manufacturing in Industry 4.0, there is an imperative for high-throughput, high-performance, and fully integrated visual inspection systems. Most anomaly detection approaches using defect detection datasets, such as MVTec AD, employ one-class models that require fitting separate models for each class. In contrast, unified models eliminate the…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: 20 pages

    MSC Class: 68T07 ACM Class: I.2.10

  4. arXiv:2407.00121  [pdf, other]

    cs.LG cs.AI cs.CL

    Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks

    Authors: Ibrahim Abdelaziz, Kinjal Basu, Mayank Agarwal, Sadhana Kumaravel, Matthew Stallone, Rameswar Panda, Yara Rizk, GP Bhargav, Maxwell Crouse, Chulaka Gunasekara, Shajith Ikbal, Sachin Joshi, Hima Karanam, Vineet Kumar, Asim Munawar, Sumit Neelam, Dinesh Raghu, Udit Sharma, Adriana Meza Soria, Dheeraj Sreedhar, Praveen Venkateswaran, Merve Unuvar, David Cox, Salim Roukos, Luis Lastras, et al. (1 additional author not shown)

    Abstract: Large language models (LLMs) have recently shown tremendous promise in serving as the backbone to agentic systems, as demonstrated by their performance in multi-faceted, challenging benchmarks like SWE-Bench and Agent-Bench. However, to realize the true potential of LLMs as autonomous agents, they must learn to identify, call, and interact with external tools and application program interfaces (AP…

    Submitted 27 June, 2024; originally announced July 2024.

  5. arXiv:2406.13844  [pdf, other]

    cs.CV cs.AI cs.DB

    MAMA-MIA: A Large-Scale Multi-Center Breast Cancer DCE-MRI Benchmark Dataset with Expert Segmentations

    Authors: Lidia Garrucho, Claire-Anne Reidel, Kaisar Kushibar, Smriti Joshi, Richard Osuala, Apostolia Tsirikoglou, Maciej Bobowicz, Javier del Riego, Alessandro Catanese, Katarzyna Gwoździewicz, Maria-Laura Cosaka, Pasant M. Abo-Elhoda, Sara W. Tantawy, Shorouq S. Sakrana, Norhan O. Shawky-Abdelfatah, Amr Muhammad Abdo-Salem, Androniki Kozana, Eugen Divjak, Gordana Ivanac, Katerina Nikiforaki, Michail E. Klontzas, Rosa García-Dosdá, Meltem Gulsun-Akpinar, Oğuz Lafcı, Ritse Mann, et al. (8 additional authors not shown)

    Abstract: Current research in breast cancer Magnetic Resonance Imaging (MRI), especially with Artificial Intelligence (AI), faces challenges due to the lack of expert segmentations. To address this, we introduce the MAMA-MIA dataset, comprising 1506 multi-center dynamic contrast-enhanced MRI cases with expert segmentations of primary tumors and non-mass enhancement areas. These cases were sourced from four…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 15 pages, 7 figures, 3 tables

  6. arXiv:2406.08848  [pdf, other]

    cs.CL cs.AI

    An Approach to Build Zero-Shot Slot-Filling System for Industry-Grade Conversational Assistants

    Authors: G P Shrivatsa Bhargav, Sumit Neelam, Udit Sharma, Shajith Ikbal, Dheeraj Sreedhar, Hima Karanam, Sachindra Joshi, Pankaj Dhoolia, Dinesh Garg, Kyle Croutwater, Haode Qi, Eric Wayne, J William Murdock

    Abstract: We present an approach to build a Large Language Model (LLM) based slot-filling system to perform Dialogue State Tracking in conversational assistants serving across a wide variety of industry-grade applications. Key requirements of this system include: 1) usage of smaller-sized models to meet low latency requirements and to enable convenient and cost-effective cloud and customer premise deployments…

    Submitted 13 June, 2024; originally announced June 2024.

  7. arXiv:2406.05120  [pdf, other]

    cs.CV

    Contextual fusion enhances robustness to image blurring

    Authors: Shruti Joshi, Aiswarya Akumalla, Seth Haney, Maxim Bazhenov

    Abstract: Mammalian brains handle complex reasoning by integrating information across brain regions specialized for particular sensory modalities. This enables improved robustness and generalization versus deep neural networks, which typically process one modality and are vulnerable to perturbations. While defense methods exist, they do not generalize well across perturbations. We developed a fusion model c…

    Submitted 7 June, 2024; originally announced June 2024.

    Comments: arXiv admin note: text overlap with arXiv:2011.09526

  8. arXiv:2405.14030  [pdf, other]

    cs.CV cs.CL

    Refining Skewed Perceptions in Vision-Language Models through Visual Representations

    Authors: Haocheng Dai, Sarang Joshi

    Abstract: Large vision-language models (VLMs), such as CLIP, have become foundational, demonstrating remarkable success across a variety of downstream tasks. Despite their advantages, these models, akin to other foundational systems, inherit biases from the disproportionate distribution of real-world data, leading to misconceptions about the actual environment. Prevalent datasets like ImageNet are often rid…

    Submitted 22 May, 2024; originally announced May 2024.

    Comments: 18 pages, 7 figures

  9. arXiv:2405.06467  [pdf, other]

    cs.CV

    Attend, Distill, Detect: Attention-aware Entropy Distillation for Anomaly Detection

    Authors: Sushovan Jena, Vishwas Saini, Ujjwal Shaw, Pavitra Jain, Abhay Singh Raihal, Anoushka Banerjee, Sharad Joshi, Ananth Ganesh, Arnav Bhavsar

    Abstract: Unsupervised anomaly detection encompasses diverse applications in industrial settings where high throughput and precision are imperative. Early works were centered around the one-class-one-model paradigm, which poses significant challenges in large-scale production environments. Knowledge-distillation based multi-class anomaly detection promises low latency with reasonably good performance but w…

    Submitted 10 May, 2024; originally announced May 2024.

    Comments: 15 pages

    MSC Class: 68T07 ACM Class: I.2.10

  10. arXiv:2405.01852  [pdf]

    cs.DC cs.CR cs.ET

    Tokenization of Real Estate Assets Using Blockchain

    Authors: Shashank Joshi, Arhan Choudhury

    Abstract: Blockchain technology is one of the key technologies that have revolutionized various facets of society, such as the banking, healthcare, and other critical ecosystems. One area that can harness the usage of blockchain is the real estate sector. The most lucrative long-term investment is real estate, followed by gold, equities, mutual funds, and savings accounts. Nevertheless, it has administrativ…

    Submitted 3 May, 2024; originally announced May 2024.

    Journal ref: IJIIT vol.18, no.3 2022: pp.1-12.

  11. arXiv:2404.19291  [pdf, other]

    cs.HC

    Dynamic Human Trust Modeling of Autonomous Agents With Varying Capability and Strategy

    Authors: Jason Dekarske, Zhaodan Kong, Sanjay Joshi

    Abstract: Objective: We model the dynamic trust of human subjects in a human-autonomy-teaming screen-based task. Background: Trust is an emerging area of study in human-robot collaboration. Many studies have looked at the issue of robot performance as a sole predictor of human trust, but this could underestimate the complexity of the interaction. Method: Subjects were paired with autonomous agents to searc…

    Submitted 30 April, 2024; originally announced April 2024.

  12. arXiv:2404.16632  [pdf]

    cs.CR cs.SE

    Introducing Systems Thinking as a Framework for Teaching and Assessing Threat Modeling Competency

    Authors: Siddhant S. Joshi, Preeti Mukherjee, Kirsten A. Davis, James C. Davis

    Abstract: Computing systems face diverse and substantial cybersecurity threats. To mitigate these cybersecurity threats, software engineers need to be competent in the skill of threat modeling. In industry and academia, there are many frameworks for teaching threat modeling, but our analysis of these frameworks suggests that (1) these approaches tend to be focused on component-level analysis rather than edu…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Presented at the Annual Conference of the American Society for Engineering Education (ASEE'24) 2024

  13. arXiv:2404.05807  [pdf, other]

    cs.NE

    Slax: A Composable JAX Library for Rapid and Flexible Prototyping of Spiking Neural Networks

    Authors: Thomas M. Summe, Siddharth Joshi

    Abstract: Recent advances in algorithms for training spiking neural networks (SNNs) often leverage their unique dynamics. While backpropagation through time (BPTT) with surrogate gradients dominates the field, a rich landscape of alternatives can situate algorithms across various points in the performance, bio-plausibility, and complexity landscape. Evaluating and comparing algorithms is currently a cumberso…

    Submitted 8 April, 2024; originally announced April 2024.

    Comments: 13 pages, 11 figures, early draft

  14. arXiv:2403.18679  [pdf]

    cs.SE cs.HC

    An Exploratory Study on Upper-Level Computing Students' Use of Large Language Models as Tools in a Semester-Long Project

    Authors: Ben Arie Tanay, Lexy Arinze, Siddhant S. Joshi, Kirsten A. Davis, James C. Davis

    Abstract: Background: Large Language Models (LLMs) such as ChatGPT and CoPilot are influencing software engineering practice. Software engineering educators must teach future software engineers how to use such tools well. As of yet, there have been few studies that report on the use of LLMs in the classroom. It is, therefore, important to evaluate students' perception of LLMs and possible ways of adapting t…

    Submitted 16 April, 2024; v1 submitted 27 March, 2024; originally announced March 2024.

    Comments: Accepted to the 2024 General Conference of the American Society for Engineering Education (ASEE)

  15. arXiv:2403.17426  [pdf, other]

    cs.AI

    Knowledge-Powered Recommendation for an Improved Diet Water Footprint

    Authors: Saurav Joshi, Filip Ilievski, Jay Pujara

    Abstract: According to WWF, 1.1 billion people lack access to water, and 2.7 billion experience water scarcity at least one month a year. By 2025, two-thirds of the world's population may be facing water shortages. This highlights the urgency of managing water usage efficiently, especially in water-intensive sectors like food. This paper proposes a recommendation engine, powered by knowledge graphs, aiming…

    Submitted 26 March, 2024; originally announced March 2024.

    Comments: 3 pages, 1 figure, AAAI'24

  16. arXiv:2403.13890  [pdf, other]

    eess.IV cs.AI cs.CV cs.LG

    Towards Learning Contrast Kinetics with Multi-Condition Latent Diffusion Models

    Authors: Richard Osuala, Daniel M. Lang, Preeti Verma, Smriti Joshi, Apostolia Tsirikoglou, Grzegorz Skorupko, Kaisar Kushibar, Lidia Garrucho, Walter H. L. Pinaya, Oliver Diaz, Julia A. Schnabel, Karim Lekadir

    Abstract: Contrast agents in dynamic contrast enhanced magnetic resonance imaging make it possible to localize tumors and observe their contrast kinetics, which is essential for cancer characterization and respective treatment decision-making. However, contrast agent administration is not only associated with adverse health risks, but also restricted for patients during pregnancy, and for those with kidney malfunction…

    Submitted 17 July, 2024; v1 submitted 20 March, 2024; originally announced March 2024.

    Comments: Early Accept at MICCAI2024

  17. arXiv:2403.12267  [pdf, other]

    cs.CV cs.LG

    Data-Efficient Contrastive Language-Image Pretraining: Prioritizing Data Quality over Quantity

    Authors: Siddharth Joshi, Arnav Jain, Ali Payani, Baharan Mirzasoleiman

    Abstract: Contrastive Language-Image Pre-training (CLIP) on large-scale image-caption datasets learns representations that can achieve remarkable zero-shot generalization. However, such models require a massive amount of pre-training data. Improving the quality of the pre-training data has been shown to be much more effective in improving CLIP's performance than increasing its volume. Nevertheless, finding…

    Submitted 19 March, 2024; v1 submitted 18 March, 2024; originally announced March 2024.

    Comments: AISTATS 2024, Code: https://github.com/BigML-CS-UCLA/clipcov-data-efficient-clip

  18. arXiv:2403.11391  [pdf, other]

    cs.LG cs.CV

    Investigating the Benefits of Projection Head for Representation Learning

    Authors: Yihao Xue, Eric Gan, Jiayi Ni, Siddharth Joshi, Baharan Mirzasoleiman

    Abstract: An effective technique for obtaining high-quality representations is adding a projection head on top of the encoder during training, then discarding it and using the pre-projection representations. Despite its proven practical effectiveness, the reason behind the success of this technique is poorly understood. The pre-projection representations are not directly optimized by the loss function, rais…

    Submitted 17 March, 2024; originally announced March 2024.

    Journal ref: ICLR 2024

  19. arXiv:2403.04890  [pdf, other]

    cs.CL

    Few shot chain-of-thought driven reasoning to prompt LLMs for open ended medical question answering

    Authors: Ojas Gramopadhye, Saeel Sandeep Nachane, Prateek Chanda, Ganesh Ramakrishnan, Kshitij Sharad Jadhav, Yatin Nandwani, Dinesh Raghu, Sachindra Joshi

    Abstract: Large language models (LLMs) have demonstrated significant potential in transforming healthcare by automating tasks such as clinical documentation, information retrieval, and decision support. In this respect, carefully engineered prompts have emerged as a powerful tool for using LLMs for medical scenarios, e.g., patient clinical scenarios. In this paper, we propose a modified version of the MedQA-…

    Submitted 7 March, 2024; originally announced March 2024.

  20. arXiv:2403.02054  [pdf, other]

    cs.AI

    Large Language Model-Based Evolutionary Optimizer: Reasoning with elitism

    Authors: Shuvayan Brahmachary, Subodh M. Joshi, Aniruddha Panda, Kaushik Koneripalli, Arun Kumar Sagotra, Harshil Patel, Ankush Sharma, Ameya D. Jagtap, Kaushic Kalyanaraman

    Abstract: Large Language Models (LLMs) have demonstrated remarkable reasoning abilities, prompting interest in their application as black-box optimizers. This paper asserts that LLMs possess the capability for zero-shot optimization across diverse scenarios, including multi-objective and high-dimensional problems. We introduce a novel population-based method for numerical optimization using LLMs called Lang…

    Submitted 4 March, 2024; originally announced March 2024.

  21. arXiv:2403.01926  [pdf, other]

    cs.CL

    IndicVoices: Towards building an Inclusive Multilingual Speech Dataset for Indian Languages

    Authors: Tahir Javed, Janki Atul Nawale, Eldho Ittan George, Sakshi Joshi, Kaushal Santosh Bhogale, Deovrat Mehendale, Ishvinder Virender Sethi, Aparna Ananthanarayanan, Hafsah Faquih, Pratiti Palit, Sneha Ravishankar, Saranya Sukumaran, Tripura Panchagnula, Sunjay Murali, Kunal Sharad Gandhi, Ambujavalli R, Manickam K M, C Venkata Vaijayanthi, Krishnan Srinivasa Raghavan Karunganni, Pratyush Kumar, Mitesh M Khapra

    Abstract: We present INDICVOICES, a dataset of natural and spontaneous speech containing a total of 7348 hours of read (9%), extempore (74%) and conversational (17%) audio from 16237 speakers covering 145 Indian districts and 22 languages. Of these 7348 hours, 1639 hours have already been transcribed, with a median of 73 hours per language. Through this paper, we share our journey of capturing the cultural,…

    Submitted 4 March, 2024; originally announced March 2024.

  22. arXiv:2402.19355  [pdf, other]

    cs.SD cs.CR cs.LG eess.AS

    Unraveling Adversarial Examples against Speaker Identification -- Techniques for Attack Detection and Victim Model Classification

    Authors: Sonal Joshi, Thomas Thebaud, Jesús Villalba, Najim Dehak

    Abstract: Adversarial examples have proven to threaten speaker identification systems, and several countermeasures against them have been proposed. In this paper, we propose a method to detect the presence of adversarial examples, i.e., a binary classifier distinguishing between benign and adversarial examples. We build upon and extend previous work on attack type classification by exploring new architectur…

    Submitted 29 February, 2024; originally announced February 2024.

  23. arXiv:2402.19109  [pdf, other]

    stat.ME cs.IT

    Confidence and Assurance of Percentiles

    Authors: Sanjay M. Joshi

    Abstract: The confidence interval of the mean is often used when quoting statistics. The same rigor is often missing when quoting percentiles and tolerance or percentile intervals. This article derives the expression for confidence in percentiles of a sample population. Confidence intervals of the median are compared to those of the mean for a few sample distributions. The concept of assurance from reliability engineering i…

    Submitted 29 February, 2024; originally announced February 2024.

    Comments: 5 pages, 4 Figures

  24. arXiv:2402.07658  [pdf, other]

    cs.CL cs.SD eess.AS

    The Sound of Healthcare: Improving Medical Transcription ASR Accuracy with Large Language Models

    Authors: Ayo Adedeji, Sarita Joshi, Brendan Doohan

    Abstract: In the rapidly evolving landscape of medical documentation, transcribing clinical dialogues accurately is increasingly paramount. This study explores the potential of Large Language Models (LLMs) to enhance the accuracy of Automatic Speech Recognition (ASR) systems in medical transcription. Utilizing the PriMock57 dataset, which encompasses a diverse range of primary care consultations, we apply a…

    Submitted 12 February, 2024; originally announced February 2024.

    Comments: 31 pages, 17 figures

  25. arXiv:2402.06185  [pdf, other]

    cs.CV cs.AI cs.LG

    Development and validation of an artificial intelligence model to accurately predict spinopelvic parameters

    Authors: Edward S. Harake, Joseph R. Linzey, Cheng Jiang, Rushikesh S. Joshi, Mark M. Zaki, Jaes C. Jones, Siri S. Khalsa, John H. Lee, Zachary Wilseck, Jacob R. Joseph, Todd C. Hollon, Paul Park

    Abstract: Objective. Achieving appropriate spinopelvic alignment has been shown to be associated with improved clinical symptoms. However, measurement of spinopelvic radiographic parameters is time-intensive and interobserver reliability is a concern. Automated measurement tools have the promise of rapid and consistent measurements, but existing tools are still limited by some degree of manual user-entry re…

    Submitted 8 February, 2024; originally announced February 2024.

    Comments: 10 pages, 5 figures, to appear in Journal of Neurosurgery: Spine

  26. arXiv:2402.02479  [pdf, other]

    cs.LG cs.AI cs.CL cs.HC

    BRAIn: Bayesian Reward-conditioned Amortized Inference for natural language generation from feedback

    Authors: Gaurav Pandey, Yatin Nandwani, Tahira Naseem, Mayank Mishra, Guangxuan Xu, Dinesh Raghu, Sachindra Joshi, Asim Munawar, Ramón Fernandez Astudillo

    Abstract: Distribution matching methods for language model alignment such as Generation with Distributional Control (GDC) and Distributional Policy Gradient (DPG) have not received the same level of attention in reinforcement learning from human feedback (RLHF) as contrastive methods such as Sequence Likelihood Calibration (SLiC), Direct Preference Optimization (DPO) and its variants. We identify high varia…

    Submitted 10 June, 2024; v1 submitted 4 February, 2024; originally announced February 2024.

    Comments: Accepted at ICML 2024 (main conference)

  27. Exploring the Need of Accessibility Education in the Software Industry: Insights from a Survey of Software Professionals in India

    Authors: Parthasarathy P D, Swaroop Joshi

    Abstract: A UserWay study in 2021 indicates that an annual global e-commerce revenue loss of approximately $16 billion can be attributed to inaccessible websites and applications. According to the 2023 WebAIM study, only 3.7% of the world's top one million website homepages are fully accessible. This shows that many software developers use poor coding practices that don't adhere to the Web Content Accessibi…

    Submitted 31 December, 2023; originally announced January 2024.

    Comments: To be published in International Conference on Software Engineering (ICSE'24), Software Engineering Education and Training Track

    ACM Class: K.3.2; K.4.2

  28. Teaching Digital Accessibility to Industry Professionals using the Community of Practice Framework: An Experience Report

    Authors: Parthasarathy PD, Swaroop Joshi

    Abstract: Despite recent initiatives aimed at improving accessibility, the field of digital accessibility remains markedly behind contemporary advancements in the software industry as a large number of real world software and web applications continue to fall short of accessibility requirements. A persisting skills deficit within the existing technology workforce has been an enduring impediment, hindering o…

    Submitted 31 December, 2023; originally announced January 2024.

    Comments: To be published in International Conference on Software Engineering (ICSE'24), Software Engineering Education and Training Track

    ACM Class: K.3.2; K.4.2

  29. arXiv:2312.07418  [pdf]

    cs.CV

    Attention Based Encoder Decoder Model for Video Captioning in Nepali (2023)

    Authors: Kabita Parajuli, Shashidhar Ram Joshi

    Abstract: Video captioning in Nepali, a language written in the Devanagari script, presents a unique challenge due to the lack of existing academic work in this domain. This work develops a novel encoder-decoder paradigm for Nepali video captioning to tackle this difficulty. LSTM and GRU sequence-to-sequence models are used in the model to produce related textual descriptions based on features retrieved fro…

    Submitted 19 May, 2024; v1 submitted 12 December, 2023; originally announced December 2023.

  30. arXiv:2311.16151  [pdf, other]

    cs.NE cs.LG

    Estimating Post-Synaptic Effects for Online Training of Feed-Forward SNNs

    Authors: Thomas Summe, Clemens JS Schaefer, Siddharth Joshi

    Abstract: Facilitating online learning in spiking neural networks (SNNs) is a key step in developing event-based models that can adapt to changing environments and learn from continuous data streams in real-time. Although forward-mode differentiation enables online learning, its computational requirements restrict scalability. This is typically addressed through approximations that limit learning in deep mo…

    Submitted 7 November, 2023; originally announced November 2023.

  31. arXiv:2311.14335  [pdf, other]

    cs.LG cs.AI

    Comparative Analysis of Transformers for Modeling Tabular Data: A Casestudy using Industry Scale Dataset

    Authors: Usneek Singh, Piyush Arora, Shamika Ganesan, Mohit Kumar, Siddhant Kulkarni, Salil R. Joshi

    Abstract: We perform a comparative analysis of transformer-based models designed for modeling tabular data, specifically on an industry-scale dataset. While earlier studies demonstrated promising outcomes on smaller public or synthetic datasets, the effectiveness did not extend to larger industry-scale datasets. The challenges identified include handling high-dimensional data, the necessity for efficient pr…

    Submitted 24 November, 2023; originally announced November 2023.

    Comments: Accepted at 7th Joint International Conference on Data Science & Management of Data (11th ACM IKDD CODS and 29th COMAD)

  32. arXiv:2311.12944  [pdf, other]

    cs.NI cs.AI cs.LG cs.NE

    SkyCharge: Deploying Unmanned Aerial Vehicles for Dynamic Load Optimization in Solar Small Cell 5G Networks

    Authors: Daksh Dave, Vinay Chamola, Sandeep Joshi, Sherali Zeadally

    Abstract: The power requirements posed by the fifth-generation and beyond cellular networks are an important constraint in network deployment and require energy-efficient solutions. In this work, we propose a novel user load transfer approach using airborne base stations (BS) mounted on drones for reliable and secure power redistribution across the micro-grid network comprising green small cell BSs. Dependi…

    Submitted 9 February, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

  33. arXiv:2311.12235  [pdf, other]

    cs.AR cs.LG cs.NE

    Improvements in Interlayer Pipelining of CNN Accelerators Using Genetic Algorithms

    Authors: Mark Horeni, Siddharth Joshi

    Abstract: Deploying Convolutional Neural Networks (CNNs) on edge platforms necessitates efficient hardware acceleration. Any unnecessary data movement in such accelerators can unacceptably degrade performance and efficiency. To address this, we develop a layer fusion technique targeting CNNs that reduces off-chip data communication using a Genetic Algorithm (GA) applied to graph-based topological sort. Res…

    Submitted 20 November, 2023; originally announced November 2023.

  34. arXiv:2311.11157  [pdf, other]

    cs.SI cs.AI cs.IR

    Contextualizing Internet Memes Across Social Media Platforms

    Authors: Saurav Joshi, Filip Ilievski, Luca Luceri

    Abstract: Internet memes have emerged as a novel format for communication and expressing ideas on the web. Their fluidity and creative nature are reflected in their widespread use, often across platforms and occasionally for unethical or harmful purposes. While computational work has already analyzed their high-level virality over time and developed specialized classifiers for hate speech detection, there h…

    Submitted 26 February, 2024; v1 submitted 18 November, 2023; originally announced November 2023.

    Comments: 10 pages, 7 figures, 2 tables

  35. arXiv:2311.10899  [pdf, other]

    cs.CV cs.CL cs.LG

    Extraction and Summarization of Explicit Video Content using Multi-Modal Deep Learning

    Authors: Shaunak Joshi, Raghav Gaggar

    Abstract: With the increase in video-sharing platforms across the internet, it is difficult for humans to moderate the data for explicit content. Hence, an automated pipeline to scan through video data for explicit content has become the need of the hour. We propose a novel pipeline that uses multi-modal deep learning to first extract the explicit segments of input videos and then summarize their content us…

    Submitted 20 November, 2023; v1 submitted 17 November, 2023; originally announced November 2023.

    Comments: 8 pages, 3 figures

    ACM Class: I.2.10

  36. arXiv:2311.10879  [pdf, other]

    eess.IV cs.CV cs.LG

    Pre- to Post-Contrast Breast MRI Synthesis for Enhanced Tumour Segmentation

    Authors: Richard Osuala, Smriti Joshi, Apostolia Tsirikoglou, Lidia Garrucho, Walter H. L. Pinaya, Oliver Diaz, Karim Lekadir

    Abstract: Despite its benefits for tumour detection and treatment, the administration of contrast agents in dynamic contrast-enhanced MRI (DCE-MRI) is associated with a range of issues, including their invasiveness, bioaccumulation, and a risk of nephrogenic systemic fibrosis. This study explores the feasibility of producing synthetic contrast enhancements by translating pre-contrast T1-weighted fat-saturat…

    Submitted 31 May, 2024; v1 submitted 17 November, 2023; originally announced November 2023.

    Comments: Accepted as oral presentation at SPIE Medical Imaging 2024 (Image Processing)

  37. arXiv:2311.08705  [pdf, other]

    cs.CL

    Evaluating Robustness of Dialogue Summarization Models in the Presence of Naturally Occurring Variations

    Authors: Ankita Gupta, Chulaka Gunasekara, Hui Wan, Jatin Ganhotra, Sachindra Joshi, Marina Danilevsky

    Abstract: The dialogue summarization task involves summarizing long conversations while preserving the most salient information. Real-life dialogues often involve naturally occurring variations (e.g., repetitions, hesitations), and existing dialogue summarization models suffer a performance drop on such conversations. In this study, we systematically investigate the impact of such variations on state-of-the-a…

    Submitted 15 November, 2023; originally announced November 2023.

  38. arXiv:2311.07693  [pdf, other]

    cs.LG

    Matching aggregate posteriors in the variational autoencoder

    Authors: Surojit Saha, Sarang Joshi, Ross Whitaker

    Abstract: The variational autoencoder (VAE) is a well-studied, deep, latent-variable model (DLVM) that efficiently optimizes the variational lower bound of the log marginal data likelihood and has a strong theoretical foundation. However, the VAE's known failure to match the aggregate posterior often results in pockets/holes in the latent distribution (i.e., a failure to match the prior) and/or \emph…

    Submitted 13 November, 2023; originally announced November 2023.

  39. arXiv:2311.06212  [pdf, other]

    stat.ML cs.LG stat.AP

    Differentiable VQ-VAE's for Robust White Matter Streamline Encodings

    Authors: Andrew Lizarraga, Brandon Taraku, Edouardo Honig, Ying Nian Wu, Shantanu H. Joshi

    Abstract: Given the complex geometry of white matter streamlines, autoencoders have been proposed as a dimension-reduction tool to simplify the analysis of streamlines in a low-dimensional latent space. However, despite these recent successes, the majority of encoder architectures only perform dimension reduction on single streamlines as opposed to a full bundle of streamlines. This is a severe limitation of…

    Submitted 18 November, 2023; v1 submitted 10 November, 2023; originally announced November 2023.

    Comments: 5 pages, 4 figures, 1 table

  40. arXiv:2310.19583  [pdf, other]

    cs.CV cs.LG

    GC-MVSNet: Multi-View, Multi-Scale, Geometrically-Consistent Multi-View Stereo

    Authors: Vibhas K. Vats, Sripad Joshi, David J. Crandall, Md. Alimoor Reza, Soon-heung Jung

    Abstract: Traditional multi-view stereo (MVS) methods rely heavily on photometric and geometric consistency constraints, but newer machine learning-based MVS methods check geometric consistency across multiple source views only as a post-processing step. In this paper, we present a novel approach that explicitly encourages geometric consistency of reference view depth maps across multiple source views at di…

    Submitted 21 December, 2023; v1 submitted 30 October, 2023; originally announced October 2023.

    Comments: Accepted in WACV 2024. Link: https://openaccess.thecvf.com/content/WACV2024/html/Vats_GC-MVSNet_Multi-View_Multi-Scale_Geometrically-Consistent_Multi-View_Stereo_WACV_2024_paper.html

    Journal ref: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024

  41. arXiv:2310.16370  [pdf, other]

    cs.DC

    PartRePer-MPI: Combining Fault Tolerance and Performance for MPI Applications

    Authors: Sarthak Joshi, Sathish Vadhiyar

    Abstract: As we have entered Exascale computing, the faults in high-performance systems are expected to increase considerably. To compensate for a higher failure rate, the standard checkpoint/restart technique would need to create checkpoints at a much higher frequency, resulting in an excessive amount of overhead that would not be sustainable for many scientific applications. Replication allows for fast re…

    Submitted 25 October, 2023; originally announced October 2023.

  42. arXiv:2310.04971  [pdf, other]

    cs.LG

    Understanding the Robustness of Multi-modal Contrastive Learning to Distribution Shift

    Authors: Yihao Xue, Siddharth Joshi, Dang Nguyen, Baharan Mirzasoleiman

    Abstract: Recently, multimodal contrastive learning (MMCL) approaches, such as CLIP, have achieved remarkable success in learning representations that are robust against distribution shift and generalize to new domains. Despite the empirical success, the mechanism behind learning such generalizable representations is not understood. In this work, we rigorously analyze this problem and uncover two mechanis…

    Submitted 17 March, 2024; v1 submitted 7 October, 2023; originally announced October 2023.

  43. arXiv:2310.03675  [pdf, other]

    cs.LG

    Hadamard Domain Training with Integers for Class Incremental Quantized Learning

    Authors: Martin Schiemer, Clemens JS Schaefer, Jayden Parker Vap, Mark James Horeni, Yu Emma Wang, Juan Ye, Siddharth Joshi

    Abstract: Continual learning is a desirable feature in many modern machine learning applications, which allows in-field adaptation and updating, ranging from accommodating distribution shift, to fine-tuning, and to learning new tasks. For applications with privacy and low latency requirements, the compute and memory demands imposed by continual learning can be cost-prohibitive for resource-constrained edge p…

    Submitted 5 October, 2023; originally announced October 2023.

  44. arXiv:2309.15734  [pdf, other]

    cs.CV

    Synthetic Latent Fingerprint Generation Using Style Transfer

    Authors: Amol S. Joshi, Ali Dabouei, Nasser Nasrabadi, Jeremy Dawson

    Abstract: Limited data availability is a challenging problem in the latent fingerprint domain. Synthetically generated fingerprints are vital for training data-hungry neural network-based algorithms. Conventional methods distort clean fingerprints to generate synthetic latent fingerprints. We propose a simple and effective approach using style transfer and image blending to synthesize realistic latent finge…

    Submitted 27 September, 2023; originally announced September 2023.

  45. arXiv:2309.12325  [pdf]

    cs.CY cs.AI cs.CV cs.LG

    FUTURE-AI: International consensus guideline for trustworthy and deployable artificial intelligence in healthcare

    Authors: Karim Lekadir, Aasa Feragen, Abdul Joseph Fofanah, Alejandro F Frangi, Alena Buyx, Anais Emelie, Andrea Lara, Antonio R Porras, An-Wen Chan, Arcadi Navarro, Ben Glocker, Benard O Botwe, Bishesh Khanal, Brigit Beger, Carol C Wu, Celia Cintas, Curtis P Langlotz, Daniel Rueckert, Deogratias Mzurikwao, Dimitrios I Fotiadis, Doszhan Zhussupov, Enzo Ferrante, Erik Meijering, Eva Weicken, Fabio A González, et al. (95 additional authors not shown)

    Abstract: Despite major advances in artificial intelligence (AI) for medicine and healthcare, the deployment and adoption of AI technologies remain limited in real-world clinical practice. In recent years, concerns have been raised about the technical, clinical, ethical and legal risks associated with medical AI. To increase real-world adoption, it is essential that medical AI tools are trusted and accepted…

    Submitted 8 July, 2024; v1 submitted 11 August, 2023; originally announced September 2023.

    ACM Class: I.2.0; I.4.0; I.5.0

  46. arXiv:2309.05680  [pdf, other]

    cs.HC cs.AI cs.SE

    Evaluating Chatbots to Promote Users' Trust -- Practices and Open Problems

    Authors: Biplav Srivastava, Kausik Lakkaraju, Tarmo Koppel, Vignesh Narayanan, Ashish Kundu, Sachindra Joshi

    Abstract: Chatbots, the common moniker for collaborative assistants, are Artificial Intelligence (AI) software that enables people to naturally interact with them to get tasks done. Although chatbots have been studied since the dawn of AI, they have particularly caught the imagination of the public and businesses since the launch of easy-to-use and general-purpose Large Language Model-based chatbots like Ch…

    Submitted 13 September, 2023; v1 submitted 9 September, 2023; originally announced September 2023.

  47. arXiv:2306.15124  [pdf, other]

    cs.SE cs.AI

    Identifying and Consolidating Knowledge Engineering Requirements

    Authors: Bradley P. Allen, Filip Ilievski, Saurav Joshi

    Abstract: Knowledge engineering is the process of creating and maintaining knowledge-producing systems. Throughout the history of computer science and AI, knowledge engineering workflows have been widely used because high-quality knowledge is assumed to be crucial for reliable intelligent agents. However, the landscape of knowledge engineering has changed, presenting four challenges: unaddressed stakeholder…

    Submitted 26 June, 2023; originally announced June 2023.

  48. arXiv:2306.11957  [pdf, other]

    cs.LG

    Towards Mitigating Spurious Correlations in the Wild: A Benchmark and a more Realistic Dataset

    Authors: Siddharth Joshi, Yu Yang, Yihao Xue, Wenhan Yang, Baharan Mirzasoleiman

    Abstract: Deep neural networks often exploit non-predictive features that are spuriously correlated with class labels, leading to poor performance on groups of examples without such features. Despite the growing body of recent work on remedying spurious correlations, the lack of a standardized benchmark hinders reproducible evaluation and comparison of the proposed solutions. To address this, we present Sp…

    Submitted 29 September, 2023; v1 submitted 20 June, 2023; originally announced June 2023.

    Comments: Package: https://github.com/BigML-CS-UCLA/SpuCo

  49. arXiv:2306.04879  [pdf, other]

    cs.LG

    Augmenting Hessians with Inter-Layer Dependencies for Mixed-Precision Post-Training Quantization

    Authors: Clemens JS Schaefer, Navid Lambert-Shirzad, Xiaofan Zhang, Chiachen Chou, Tom Jablin, Jian Li, Elfie Guo, Caitlin Stanton, Siddharth Joshi, Yu Emma Wang

    Abstract: Efficiently serving neural network models with low latency is becoming more challenging due to increasing model complexity and parameter count. Model quantization offers a solution that simultaneously reduces memory footprint and compute requirements. However, aggressive quantization may lead to an unacceptable loss in model accuracy owing to differences in sensitivity to numerical imperfection a…

    Submitted 7 June, 2023; originally announced June 2023.

  50. arXiv:2306.00248  [pdf, other]

    cs.IR cs.AI

    TransAct: Transformer-based Realtime User Action Model for Recommendation at Pinterest

    Authors: Xue Xia, Pong Eksombatchai, Nikil Pancha, Dhruvil Deven Badani, Po-Wei Wang, Neng Gu, Saurabh Vishwas Joshi, Nazanin Farahpour, Zhiyuan Zhang, Andrew Zhai

    Abstract: Sequential models that encode user activity for next action prediction have become a popular design choice for building web-scale personalized recommendation systems. Traditional methods of sequential recommendation either utilize end-to-end learning on realtime user actions, or learn user representations separately in an offline batch-generated manner. This paper (1) presents Pinterest's ranking…

    Submitted 31 May, 2023; originally announced June 2023.

    Comments: © ACM 2023. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in KDD'23, http://dx.doi.org/10.1145/3580305.3599918