
Showing 1–50 of 1,056 results for author: Kumar, S

  1. arXiv:2407.13318  [pdf, other]

    cs.CR

    A new approach to delegate signing rights to proxy signers using isogeny-based cryptography

    Authors: Kunal Dey, Somnath Kumar, Vikas Srivastava, Sumit Kumar Debnath

    Abstract: E-governance is a two-way protocol through which one can use government services, share data and request information. It refers to the use of communication and information technologies to provide government services to public in an efficient and fast manner. In addition, any document submitted to the e-Government system must be authenticated by a government officer using a digital signature scheme…

    Submitted 18 July, 2024; originally announced July 2024.

  2. arXiv:2407.12185  [pdf, other]

    cs.LG cs.AI

    Satisficing Exploration for Deep Reinforcement Learning

    Authors: Dilip Arumugam, Saurabh Kumar, Ramki Gummadi, Benjamin Van Roy

    Abstract: A default assumption in the design of reinforcement-learning algorithms is that a decision-making agent always explores to learn optimal behavior. In sufficiently complex environments that approach the vastness and scale of the real world, however, attaining optimal performance may in fact be an entirely intractable endeavor and an agent may seldom find itself in a position to complete the requisi…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Accepted to the Finding the Frame Workshop at RLC 2024

  3. arXiv:2407.12043  [pdf, other]

    cs.CL cs.AI cs.HC

    The Art of Saying No: Contextual Noncompliance in Language Models

    Authors: Faeze Brahman, Sachin Kumar, Vidhisha Balachandran, Pradeep Dasigi, Valentina Pyatkin, Abhilasha Ravichander, Sarah Wiegreffe, Nouha Dziri, Khyathi Chandu, Jack Hessel, Yulia Tsvetkov, Noah A. Smith, Yejin Choi, Hannaneh Hajishirzi

    Abstract: Chat-based language models are designed to be helpful, yet they should not comply with every user request. While most existing work primarily focuses on refusal of "unsafe" queries, we posit that the scope of noncompliance should be broadened. We introduce a comprehensive taxonomy of contextual noncompliance describing when and how models should not comply with user requests. Our taxonomy spans a…

    Submitted 2 July, 2024; originally announced July 2024.

  4. arXiv:2407.10837  [pdf, other]

    eess.SY cs.RO math.DS

    Trajectory Tracking for Unmanned Aerial Vehicles in 3D Spaces under Motion Constraints

    Authors: Saurabh Kumar, Shashi Ranjan Kumar, Abhinav Sinha

    Abstract: This article presents a three-dimensional nonlinear trajectory tracking control strategy for unmanned aerial vehicles (UAVs) in the presence of spatial constraints. As opposed to many existing control strategies, which do not consider spatial constraints, the proposed strategy considers spatial constraints on each degree of freedom movement of the UAV. Such consideration makes the design appealing…

    Submitted 15 July, 2024; originally announced July 2024.

  5. arXiv:2407.09466  [pdf, other]

    cs.RO cs.GR

    TRAVERSE: Traffic-Responsive Autonomous Vehicle Experience & Rare-event Simulation for Enhanced safety

    Authors: Sandeep Thalapanane, Sandip Sharan Senthil Kumar, Guru Nandhan Appiya Dilipkumar Peethambari, Sourang SriHari, Laura Zheng, Julio Poveda, Ming C. Lin

    Abstract: Data for training learning-enabled self-driving cars in the physical world are typically collected in a safe, normal environment. Such data distribution often engenders a strong bias towards safe driving, making self-driving cars unprepared when encountering adversarial scenarios like unexpected accidents. Due to a dearth of such adverse data that is unrealistic for drivers to collect, autonomous…

    Submitted 12 July, 2024; originally announced July 2024.

  6. arXiv:2407.08818  [pdf]

    cs.CL

    MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization

    Authors: Orevaoghene Ahia, Sachin Kumar, Hila Gonen, Valentin Hofmann, Tomasz Limisiewicz, Yulia Tsvetkov, Noah A. Smith

    Abstract: In multilingual settings, non-Latin scripts and low-resource languages are usually disadvantaged in terms of language models' utility, efficiency, and cost. Specifically, previous studies have reported multiple modeling biases that the current tokenization algorithms introduce to non-Latin script languages, the main one being over-segmentation. In this work, we propose MAGNET; multilingual adaptiv…

    Submitted 11 July, 2024; originally announced July 2024.

  7. arXiv:2407.08726  [pdf, other]

    cs.CV

    Map It Anywhere (MIA): Empowering Bird's Eye View Mapping using Large-scale Public Data

    Authors: Cherie Ho, Jiaye Zou, Omar Alama, Sai Mitheran Jagadesh Kumar, Benjamin Chiang, Taneesh Gupta, Chen Wang, Nikhil Keetha, Katia Sycara, Sebastian Scherer

    Abstract: Top-down Bird's Eye View (BEV) maps are a popular representation for ground robot navigation due to their richness and flexibility for downstream tasks. While recent methods have shown promise for predicting BEV maps from First-Person View (FPV) images, their generalizability is limited to small regions captured by current autonomous vehicle-based datasets. In this context, we show that a more sca…

    Submitted 11 July, 2024; originally announced July 2024.

  8. arXiv:2407.08655  [pdf, other]

    eess.IV cs.AI cs.LG physics.med-ph

    SPOCKMIP: Segmentation of Vessels in MRAs with Enhanced Continuity using Maximum Intensity Projection as Loss

    Authors: Chethan Radhakrishna, Karthikesh Varma Chintalapati, Sri Chandana Hudukula Ram Kumar, Raviteja Sutrave, Hendrik Mattern, Oliver Speck, Andreas Nürnberger, Soumick Chatterjee

    Abstract: Identification of vessel structures of different sizes in biomedical images is crucial in the diagnosis of many neurodegenerative diseases. However, the sparsity of good-quality annotations of such images makes the task of vessel segmentation challenging. Deep learning offers an efficient way to segment vessels of different sizes by learning their high-level feature representations and the spatial…

    Submitted 11 July, 2024; originally announced July 2024.

  9. arXiv:2407.07786  [pdf, ps, other]

    cs.HC cs.AI cs.CY

    The Human Factor in AI Red Teaming: Perspectives from Social and Collaborative Computing

    Authors: Alice Qian Zhang, Ryland Shaw, Jacy Reese Anthis, Ashlee Milton, Emily Tseng, Jina Suh, Lama Ahmad, Ram Shankar Siva Kumar, Julian Posada, Benjamin Shestakofsky, Sarah T. Roberts, Mary L. Gray

    Abstract: Rapid progress in general-purpose AI has sparked significant interest in "red teaming," a practice of adversarial testing originating in military and cybersecurity applications. AI red teaming raises many questions about the human factor, such as how red teamers are selected, biases and blindspots in how tests are conducted, and harmful content's psychological effects on red teamers. A growing bod…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: Workshop proposal accepted to CSCW 2024

  10. arXiv:2407.07128  [pdf, other]

    cs.LG cs.SI stat.ML

    Modularity aided consistent attributed graph clustering via coarsening

    Authors: Samarth Bhatia, Yukti Makhija, Manoj Kumar, Sandeep Kumar

    Abstract: Graph clustering is an important unsupervised learning technique for partitioning graphs with attributes and detecting communities. However, current methods struggle to accurately capture true community structures and intra-cluster relations, be computationally efficient, and identify smaller communities. We address these challenges by integrating coarsening and modularity maximization, effectivel…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: The first two authors contributed equally to this work

  11. arXiv:2407.06868  [pdf, other]

    cs.IT cs.LG eess.SP

    Energy Efficient Fair STAR-RIS for Mobile Users

    Authors: Ashok S. Kumar, Nancy Nayak, Sheetal Kalyani, Himal A. Suraweera

    Abstract: In this work, we propose a method to improve the energy efficiency and fairness of simultaneously transmitting and reflecting reconfigurable intelligent surfaces (STAR-RIS) for mobile users, ensuring reduced power consumption while maintaining reliable communication. To achieve this, we introduce a new parameter known as the subsurface assignment variable, which determines the number of STAR-RIS e…

    Submitted 9 July, 2024; originally announced July 2024.

  12. arXiv:2407.04444  [pdf, other]

    cs.CL cs.SD eess.AS

    TokenVerse: Unifying Speech and NLP Tasks via Transducer-based ASR

    Authors: Shashi Kumar, Srikanth Madikeri, Juan Zuluaga-Gomez, Iuliia Nigmatulina, Esaú Villatoro-Tello, Sergio Burdisso, Petr Motlicek, Karthik Pandia, Aravind Ganapathiraju

    Abstract: In traditional conversational intelligence from speech, a cascaded pipeline is used, involving tasks such as voice activity detection, diarization, transcription, and subsequent processing with different NLP models for tasks like semantic endpointing and named entity recognition (NER). Our paper introduces TokenVerse, a single Transducer-based model designed to handle multiple tasks. This is achie…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 5 pages, double column

  13. arXiv:2407.04332  [pdf]

    cs.ET

    Energy Efficient Knapsack Optimization Using Probabilistic Memristor Crossbars

    Authors: Jinzhan Li, Suhas Kumar, Su-in Yi

    Abstract: Constrained optimization underlies crucial societal problems (for instance, stock trading and bandwidth allocation), but is often computationally hard (complexity grows exponentially with problem size). The big-data era urgently demands low-latency and low-energy optimization at the edge, which cannot be handled by digital processors due to their non-parallel von Neumann architecture. Recent effor…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: 16 pages, 8 figures

  14. arXiv:2407.04087  [pdf, other]

    cs.NE cs.AI

    Advanced Artificial Intelligence Strategy for Optimizing Urban Rail Network Design using Nature-Inspired Algorithms

    Authors: Hariram Sampath Kumar, Archana Singh, Manish Kumar Ojha

    Abstract: This study introduces an innovative methodology for the planning of metro network routes within the urban environment of Chennai, Tamil Nadu, India. A comparative analysis of the modified Ant Colony Optimization (ACO) method (previously developed) with recent breakthroughs in nature-inspired algorithms demonstrates the modified ACO's superiority over modern techniques. By utilizing the modified AC…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: 10 pages, 17 figures

  15. arXiv:2407.04053  [pdf, other]

    cs.DC

    Edge AI: A Taxonomy, Systematic Review and Future Directions

    Authors: Sukhpal Singh Gill, Muhammed Golec, Jianmin Hu, Minxian Xu, Junhui Du, Huaming Wu, Guneet Kaur Walia, Subramaniam Subramanian Murugesan, Babar Ali, Mohit Kumar, Kejiang Ye, Prabal Verma, Surendra Kumar, Felix Cuadrado, Steve Uhlig

    Abstract: Edge Artificial Intelligence (AI) incorporates a network of interconnected systems and devices that receive, cache, process, and analyse data in close communication with the location where the data is captured with AI technology. Recent advancements in AI efficiency, the widespread use of Internet of Things (IoT) devices, and the emergence of edge computing have unlocked the enormous scope of Edge…

    Submitted 4 July, 2024; originally announced July 2024.

    Comments: Preprint Version, 18 Figures

  16. arXiv:2407.03152  [pdf, other]

    cs.CV cs.LG

    Stereo Risk: A Continuous Modeling Approach to Stereo Matching

    Authors: Ce Liu, Suryansh Kumar, Shuhang Gu, Radu Timofte, Yao Yao, Luc Van Gool

    Abstract: We introduce Stereo Risk, a new deep-learning approach to solve the classical stereo-matching problem in computer vision. As it is well-known that stereo matching boils down to a per-pixel disparity estimation problem, the popular state-of-the-art stereo-matching approaches widely rely on regressing the scene disparity values, yet via discretization of scene disparity values. Such discretization o…

    Submitted 3 July, 2024; originally announced July 2024.

    Comments: Accepted as an Oral Paper at ICML 2024. Draft info: 18 pages, 6 Figure, 16 Tables

  17. arXiv:2407.02929  [pdf, other]

    cs.NI

    A Hybrid Reactive Routing Protocol for Decentralized UAV Networks

    Authors: Shivam Garg, Alexander Ihler, Elizabeth Serena Bentley, Sunil Kumar

    Abstract: Wireless networks consisting of low SWaP, FW-UAVs are used in many applications, such as monitoring, search and surveillance of inaccessible areas. A decentralized and autonomous approach ensures robustness to failures; the UAVs explore and sense within the area and forward their information, in a multihop manner, to nearby aerial gateway nodes. However, the unpredictable nature of the events, rel…

    Submitted 3 July, 2024; originally announced July 2024.

  18. arXiv:2407.02662  [pdf, other]

    cs.SI cs.CL cs.CY

    Supporters and Skeptics: LLM-based Analysis of Engagement with Mental Health (Mis)Information Content on Video-sharing Platforms

    Authors: Viet Cuong Nguyen, Mini Jain, Abhijat Chauhan, Heather Jaime Soled, Santiago Alvarez Lesmes, Zihang Li, Michael L. Birnbaum, Sunny X. Tang, Srijan Kumar, Munmun De Choudhury

    Abstract: Over one in five adults in the US lives with a mental illness. In the face of a shortage of mental health professionals and offline resources, online short-form video content has grown to serve as a crucial conduit for disseminating mental health help and resources. However, the ease of content creation and access also contributes to the spread of misinformation, posing risks to accurate diagnosis…

    Submitted 2 July, 2024; originally announced July 2024.

    Comments: 12 pages, in submission to ICWSM

  19. arXiv:2406.18848  [pdf, other]

    cs.LG

    Temporally Multi-Scale Sparse Self-Attention for Physical Activity Data Imputation

    Authors: Hui Wei, Maxwell A. Xu, Colin Samplawski, James M. Rehg, Santosh Kumar, Benjamin M. Marlin

    Abstract: Wearable sensors enable health researchers to continuously collect data pertaining to the physiological state of individuals in real-world settings. However, such data can be subject to extensive missingness due to a complex combination of factors. In this work, we study the problem of imputation of missing step count data, one of the most ubiquitous forms of wearable sensor data. We construct a n…

    Submitted 26 June, 2024; originally announced June 2024.

    Comments: Accepted by Conference on Health, Inference, and Learning (CHIL) 2024

  20. arXiv:2406.18510  [pdf, other]

    cs.CL

    WildTeaming at Scale: From In-the-Wild Jailbreaks to (Adversarially) Safer Language Models

    Authors: Liwei Jiang, Kavel Rao, Seungju Han, Allyson Ettinger, Faeze Brahman, Sachin Kumar, Niloofar Mireshghallah, Ximing Lu, Maarten Sap, Yejin Choi, Nouha Dziri

    Abstract: We introduce WildTeaming, an automatic LLM safety red-teaming framework that mines in-the-wild user-chatbot interactions to discover 5.7K unique clusters of novel jailbreak tactics, and then composes multiple tactics for systematic exploration of novel jailbreaks. Compared to prior work that performed red-teaming via recruited human workers, gradient-based optimization, or iterative revision with…

    Submitted 26 June, 2024; originally announced June 2024.

  21. arXiv:2406.17968  [pdf, other]

    cs.IR cs.AI cs.LG stat.ML

    Efficient Document Ranking with Learnable Late Interactions

    Authors: Ziwei Ji, Himanshu Jain, Andreas Veit, Sashank J. Reddi, Sadeep Jayasumana, Ankit Singh Rawat, Aditya Krishna Menon, Felix Yu, Sanjiv Kumar

    Abstract: Cross-Encoder (CE) and Dual-Encoder (DE) models are two fundamental approaches for query-document relevance in information retrieval. To predict relevance, CE models use joint query-document embeddings, while DE models maintain factorized query and document embeddings; usually, the former has higher quality while the latter benefits from lower latency. Recently, late-interaction models have been p…

    Submitted 25 June, 2024; originally announced June 2024.

  22. arXiv:2406.17963  [pdf, other]

    cs.LG cs.HC cs.SI

    Empowering Interdisciplinary Insights with Dynamic Graph Embedding Trajectories

    Authors: Yiqiao Jin, Andrew Zhao, Yeon-Chang Lee, Meng Ye, Ajay Divakaran, Srijan Kumar

    Abstract: We developed DyGETViz, a novel framework for effectively visualizing dynamic graphs (DGs) that are ubiquitous across diverse real-world systems. This framework leverages recent advancements in discrete-time dynamic graph (DTDG) models to adeptly handle the temporal dynamics inherent in dynamic graphs. DyGETViz effectively captures both micro- and macro-level structural shifts within these graphs,…

    Submitted 28 June, 2024; v1 submitted 25 June, 2024; originally announced June 2024.

    Comments: 27 pages, 11 figures

  23. arXiv:2406.17641  [pdf]

    cs.RO cs.HC

    The experience of humans' and robots' mutual (im)politeness in enacted service scenarios: An empirical study

    Authors: Victor Kaptelinin, Suna Bensch, Thomas Hellström, Patrik Björnfot, Shikhar Kumar

    Abstract: The paper reports an empirical study of the effect of human treatment of a robot on the social perception of the robot's behavior. The study employed an enacted interaction between an anthropomorphic "waiter" robot and two customers. The robot and one of the customers (acted out by a researcher) were following four different interaction scripts, representing all combinations of mutual politeness a…

    Submitted 25 June, 2024; originally announced June 2024.

    Comments: 19 pages, 5 figures, 7 tables

  24. arXiv:2406.13715  [pdf, other]

    cs.AI cs.IR

    Converging Dimensions: Information Extraction and Summarization through Multisource, Multimodal, and Multilingual Fusion

    Authors: Pranav Janjani, Mayank Palan, Sarvesh Shirude, Ninad Shegokar, Sunny Kumar, Faruk Kazi

    Abstract: Recent advances in large language models (LLMs) have led to new summarization strategies, offering an extensive toolkit for extracting important information. However, these approaches are frequently limited by their reliance on isolated sources of data. The amount of information that can be gathered is limited and covers a smaller range of themes, which introduces the possibility of falsified cont…

    Submitted 19 June, 2024; originally announced June 2024.

    Comments: 11 pages, 3 figures

  25. arXiv:2406.12405  [pdf]

    cs.IT cs.ET eess.SP

    On The Effective Rate and Error Rate Analysis over Fluctuating Nakagami-m Fading Channel

    Authors: Manpreet Kaur, Puspraj Singh Chauhan, Sandeep Kumar, Pappu Kumar Verma

    Abstract: This paper provides a detailed analysis of the important performance metrics like effective capacity and symbol error rate over fluctuating Nakagami-m fading channel. This distribution is obtained from the ratio of two random variables, following the Nakagami-m distribution and the uniform distribution. Our study derives exact analytical expressions for the EC and SER under different modulation sc…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 18 pages

  26. arXiv:2406.12274  [pdf, other]

    cs.CL

    SafeInfer: Context Adaptive Decoding Time Safety Alignment for Large Language Models

    Authors: Somnath Banerjee, Soham Tripathy, Sayan Layek, Shanu Kumar, Animesh Mukherjee, Rima Hazra

    Abstract: Safety-aligned language models often exhibit fragile and imbalanced safety mechanisms, increasing the likelihood of generating unsafe content. In addition, incorporating new knowledge through editing techniques to language models can further compromise safety. To address these issues, we propose SafeInfer, a context-adaptive, decoding-time safety alignment strategy for generating safe responses to…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: Under review

  27. arXiv:2406.11768  [pdf, other]

    cs.SD cs.AI cs.CL eess.AS

    GAMA: A Large Audio-Language Model with Advanced Audio Understanding and Complex Reasoning Abilities

    Authors: Sreyan Ghosh, Sonal Kumar, Ashish Seth, Chandra Kiran Reddy Evuru, Utkarsh Tyagi, S Sakshi, Oriol Nieto, Ramani Duraiswami, Dinesh Manocha

    Abstract: Perceiving and understanding non-speech sounds and non-verbal speech is essential to making decisions that help us interact with our surroundings. In this paper, we propose GAMA, a novel General-purpose Large Audio-Language Model (LALM) with Advanced Audio Understanding and Complex Reasoning Abilities. We build GAMA by integrating an LLM with multiple types of audio representations, including feat…

    Submitted 17 June, 2024; originally announced June 2024.

    Comments: Project Website: https://sreyan88.github.io/gamaaudio/

  28. arXiv:2406.09443  [pdf, other]

    eess.AS cs.HC cs.LG

    Comparative Analysis of Personalized Voice Activity Detection Systems: Assessing Real-World Effectiveness

    Authors: Satyam Kumar, Sai Srujana Buddi, Utkarsh Oggy Sarawgi, Vineet Garg, Shivesh Ranjan, Ognjen Rudovic, Ahmed Hussen Abdelaziz, Saurabh Adya

    Abstract: Voice activity detection (VAD) is a critical component in various applications such as speech recognition, speech enhancement, and hands-free communication systems. With the increasing demand for personalized and context-aware technologies, the need for effective personalized VAD systems has become paramount. In this paper, we present a comparative analysis of Personalized Voice Activity Detection…

    Submitted 11 June, 2024; originally announced June 2024.

  29. arXiv:2406.09167  [pdf, other]

    cs.SD eess.AS

    Vision Transformer Segmentation for Visual Bird Sound Denoising

    Authors: Sahil Kumar, Jialu Li, Youshan Zhang

    Abstract: Audio denoising, especially in the context of bird sounds, remains a challenging task due to persistent residual noise. Traditional and deep learning methods often struggle with artificial or low-frequency noise. In this work, we propose ViTVS, a novel approach that leverages the power of the vision transformer (ViT) architecture. ViTVS adeptly combines segmentation techniques to disentangle clean…

    Submitted 13 June, 2024; originally announced June 2024.

    Comments: INTERSPEECH 2024

  30. arXiv:2406.07910  [pdf, other]

    cs.ET cs.NI eess.SP

    Demonstration of Safe Electromagnetic Radiation Emitted by 5G Active Antenna Systems

    Authors: Sumit Kumar, Chandan Kumar Sheemar, Abdelrahman Astro, Jorge Querol, Symeon Chatzinotas

    Abstract: The careful planning and safe deployment of 5G technologies will bring enormous benefits to society and the economy. Higher frequency, beamforming, and small-cells are key technologies that will provide unmatched throughput and seamless connectivity to 5G users. Superficial knowledge of these technologies has raised concerns among the general public about the harmful effects of radiation. Several…

    Submitted 12 June, 2024; originally announced June 2024.

  31. arXiv:2406.06811  [pdf, other]

    cs.LG

    Learning Continually by Spectral Regularization

    Authors: Alex Lewandowski, Saurabh Kumar, Dale Schuurmans, András György, Marlos C. Machado

    Abstract: Loss of plasticity is a phenomenon where neural networks become more difficult to train during the course of learning. Continual learning algorithms seek to mitigate this effect by sustaining good predictive performance while maintaining network trainability. We develop new techniques for improving continual learning by first reconsidering how initialization can ensure trainability during early ph…

    Submitted 10 June, 2024; originally announced June 2024.

  32. arXiv:2406.04432  [pdf, other]

    eess.AS cs.AI cs.CL

    LipGER: Visually-Conditioned Generative Error Correction for Robust Automatic Speech Recognition

    Authors: Sreyan Ghosh, Sonal Kumar, Ashish Seth, Purva Chiniya, Utkarsh Tyagi, Ramani Duraiswami, Dinesh Manocha

    Abstract: Visual cues, like lip motion, have been shown to improve the performance of Automatic Speech Recognition (ASR) systems in noisy environments. We propose LipGER (Lip Motion aided Generative Error Correction), a novel framework for leveraging visual cues for noise-robust ASR. Instead of learning the cross-modal correlation between the audio and visual modalities, we make an LLM learn the task of vis…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: InterSpeech 2024. Code and Data: https://github.com/Sreyan88/LipGER

  33. arXiv:2406.04286  [pdf, other]

    cs.CL cs.AI

    ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions

    Authors: Sreyan Ghosh, Utkarsh Tyagi, Sonal Kumar, C. K. Evuru, S Ramaneswaran, S Sakshi, Dinesh Manocha

    Abstract: We present ABEX, a novel and effective generative data augmentation methodology for low-resource Natural Language Understanding (NLU) tasks. ABEX is based on ABstract-and-EXpand, a novel paradigm for generating diverse forms of an input document -- we first convert a document into its concise, abstract description and then generate new documents based on expanding the resultant abstraction. To lea…

    Submitted 6 June, 2024; originally announced June 2024.

    Comments: ACL 2024 Main Conference. Code and data: https://github.com/Sreyan88/ABEX

  34. arXiv:2406.02469  [pdf, other]

    cs.LG cs.CL

    Landscape-Aware Growing: The Power of a Little LAG

    Authors: Stefani Karp, Nikunj Saunshi, Sobhan Miryoosefi, Sashank J. Reddi, Sanjiv Kumar

    Abstract: Recently, there has been increasing interest in efficient pretraining paradigms for training Transformer-based models. Several recent approaches use smaller models to initialize larger models in order to save computation (e.g., stacking and fusion). In this work, we study the fundamental question of how to select the best growing strategy from a given pool of growing strategies. Prior works have e…

    Submitted 4 June, 2024; originally announced June 2024.

  35. arXiv:2405.19420  [pdf, other]

    cs.LG cs.AI q-bio.NC

    Using Contrastive Learning with Generative Similarity to Learn Spaces that Capture Human Inductive Biases

    Authors: Raja Marjieh, Sreejan Kumar, Declan Campbell, Liyi Zhang, Gianluca Bencomo, Jake Snell, Thomas L. Griffiths

    Abstract: Humans rely on strong inductive biases to learn from few examples and abstract useful information from sensory data. Instilling such biases in machine learning models has been shown to improve their performance on various benchmarks including few-shot learning, robustness, and alignment. However, finding effective training procedures to achieve that goal can be challenging as psychologically-rich…

    Submitted 29 May, 2024; originally announced May 2024.

  36. arXiv:2405.19261  [pdf, other]

    cs.CL cs.AI cs.LG

    Faster Cascades via Speculative Decoding

    Authors: Harikrishna Narasimhan, Wittawat Jitkrittum, Ankit Singh Rawat, Seungyeon Kim, Neha Gupta, Aditya Krishna Menon, Sanjiv Kumar

    Abstract: Cascades and speculative decoding are two common approaches to improving language models' inference efficiency. Both approaches involve interleaving models of different sizes, but via fundamentally distinct mechanisms: cascades employ a deferral rule that invokes the larger model only for "hard" inputs, while speculative decoding uses speculative execution to primarily invoke the larger model in p…

    Submitted 29 May, 2024; originally announced May 2024.

  37. arXiv:2405.18359  [pdf, other]

    cs.CL cs.AI cs.LG

    Bridging the Gap: Dynamic Learning Strategies for Improving Multilingual Performance in LLMs

    Authors: Somnath Kumar, Vaibhav Balloli, Mercy Ranjit, Kabir Ahuja, Tanuja Ganu, Sunayana Sitaram, Kalika Bali, Akshay Nambi

    Abstract: Large language models (LLMs) are at the forefront of transforming numerous domains globally. However, their inclusivity and effectiveness remain limited for non-Latin scripts and low-resource languages. This paper tackles the imperative challenge of enhancing the multilingual performance of LLMs without extensive training or fine-tuning. Through systematic investigation and evaluation of diverse l…

    Submitted 28 May, 2024; originally announced May 2024.

    Report number: MSR-TR-VeLLM-01

  38. arXiv:2405.18358  [pdf, other]

    cs.CL cs.AI cs.CV cs.LG

    MMCTAgent: Multi-modal Critical Thinking Agent Framework for Complex Visual Reasoning

    Authors: Somnath Kumar, Yash Gadhia, Tanuja Ganu, Akshay Nambi

    Abstract: Recent advancements in Multi-modal Large Language Models (MLLMs) have significantly improved their performance in tasks combining vision and language. However, challenges persist in detailed multi-modal understanding, comprehension of complex tasks, and reasoning over multi-modal information. This paper introduces MMCTAgent, a novel multi-modal critical thinking agent framework designed to address…

    Submitted 28 May, 2024; originally announced May 2024.

    Report number: MSR-TR-VeLLM-03

  39. arXiv:2405.16401  [pdf, other]

    cs.CV cs.LG

    Understanding the Effect of using Semantically Meaningful Tokens for Visual Representation Learning

    Authors: Neha Kalibhat, Priyatham Kattakinda, Arman Zarei, Nikita Seleznev, Samuel Sharpe, Senthil Kumar, Soheil Feizi

    Abstract: Vision transformers have established a precedent of patchifying images into uniformly-sized chunks before processing. We hypothesize that this design choice may limit models in learning comprehensive and compositional representations from visual data. This paper explores the notion of providing semantically-meaningful visual tokens to transformer encoders within a vision-language pre-training fram…

    Submitted 25 May, 2024; originally announced May 2024.

  40. arXiv:2405.15683  [pdf, other]

    cs.CV cs.AI cs.CL

    VDGD: Mitigating LVLM Hallucinations in Cognitive Prompts by Bridging the Visual Perception Gap

    Authors: Sreyan Ghosh, Chandra Kiran Reddy Evuru, Sonal Kumar, Utkarsh Tyagi, Oriol Nieto, Zeyu Jin, Dinesh Manocha

    Abstract: Recent interest in Large Vision-Language Models (LVLMs) for practical applications is moderated by the significant challenge of hallucination or the inconsistency between the factual information and the generated text. In this paper, we first perform an in-depth analysis of hallucinations and discover several novel insights about how and when LVLMs hallucinate. From our analysis, we show that: (1)…

    Submitted 24 May, 2024; originally announced May 2024.

    Comments: Preprint. Under review. Code will be released on paper acceptance

  41. arXiv:2405.11596  [pdf]

    cs.CE

    Bioinspired Nested-Isotropic Lattices with Tunable Anisotropy for Additive Manufacturing

    Authors: R. Boda, B. Panda, S. Kumar

    Abstract: This study presents innovative nested-isotropic lattices for additive manufacturing, drawing inspiration from bio-architectures found in cortical bone osteons, golden spirals, and fractals. These lattices provide tunable anisotropy by integrating architectural elements like "nesting orders (NOs)" and corresponding "nesting orientations (NORs)," along with repetitive self-similar X-cross struts…

    Submitted 19 May, 2024; originally announced May 2024.

    Comments: 50 pages, 16 figures in the main text

  42. arXiv:2405.11346  [pdf]

    cs.AI

    Decision support system for Forest fire management using Ontology with Big Data and LLMs

    Authors: Ritesh Chandra, Shashi Shekhar Kumar, Rushil Patra, Sonali Agarwal

    Abstract: Forests are crucial for ecological balance, but wildfires, a major cause of forest loss, pose significant risks. Fire weather indices, which assess wildfire risk and predict resource demands, are vital. With the rise of sensor networks in fields like healthcare and environmental monitoring, semantic sensor networks are increasingly used to gather climatic data such as wind speed, temperature, and…

    Submitted 18 May, 2024; originally announced May 2024.

  43. arXiv:2405.11029  [pdf, other]

    cs.LG cs.AI cs.CL cs.CV

    Generative Artificial Intelligence: A Systematic Review and Applications

    Authors: Sandeep Singh Sengar, Affan Bin Hasan, Sanjay Kumar, Fiona Carroll

    Abstract: In recent years, the study of artificial intelligence (AI) has undergone a paradigm shift. This has been propelled by the groundbreaking capabilities of generative models both in supervised and unsupervised learning scenarios. Generative AI has shown state-of-the-art performance in solving perplexing real-world conundrums in fields such as image translation, medical diagnostics, textual imagery fu…

    Submitted 17 May, 2024; originally announced May 2024.

  44. arXiv:2405.06049  [pdf, other]

    cs.CV cs.CR cs.LG

    BB-Patch: BlackBox Adversarial Patch-Attack using Zeroth-Order Optimization

    Authors: Satyadwyoom Kumar, Saurabh Gupta, Arun Balaji Buduru

    Abstract: Deep Learning has become popular due to its vast applications in almost all domains. However, models trained using deep learning are prone to failure for adversarial samples and carry a considerable risk in sensitive applications. Most of these adversarial attack strategies assume that the adversary has access to the training data, the model parameters, and the input during deployment, hence, focu…

    Submitted 9 May, 2024; originally announced May 2024.

  45. arXiv:2405.05530  [pdf, other]

    cs.CV

    NurtureNet: A Multi-task Video-based Approach for Newborn Anthropometry

    Authors: Yash Khandelwal, Mayur Arvind, Sriram Kumar, Ashish Gupta, Sachin Kumar Danisetty, Piyush Bagad, Anish Madan, Mayank Lunayach, Aditya Annavajjala, Abhishek Maiti, Sansiddh Jain, Aman Dalmia, Namrata Deka, Jerome White, Jigar Doshi, Angjoo Kanazawa, Rahul Panicker, Alpan Raval, Srinivas Rana, Makarand Tapaswi

    Abstract: Malnutrition among newborns is a top public health concern in developing countries. Identification and subsequent growth monitoring are key to successful interventions. However, this is challenging in rural communities where health systems tend to be inaccessible and under-equipped, with poor adherence to protocol. Our goal is to equip health workers and public health systems with a solution for c…

    Submitted 8 May, 2024; originally announced May 2024.

    Comments: Accepted at CVPM Workshop at CVPR 2024

  46. arXiv:2405.04746  [pdf, other]

    cs.IR cs.AI cs.LG

    SVD-AE: Simple Autoencoders for Collaborative Filtering

    Authors: Seoyoung Hong, Jeongwhan Choi, Yeon-Chang Lee, Srijan Kumar, Noseong Park

    Abstract: Collaborative filtering (CF) methods for recommendation systems have been extensively researched, ranging from matrix factorization and autoencoder-based to graph filtering-based methods. Recently, lightweight methods that require almost no training have been proposed to reduce overall computation. However, existing methods still have room to improve the trade-offs among accuracy, efficie…

    Submitted 7 May, 2024; originally announced May 2024.

    Comments: Accepted by IJCAI 2024

  47. arXiv:2405.04678  [pdf, other]

    cs.NI cs.MA cs.RO

    Pipe Routing with Topology Control for UAV Networks

    Authors: Shreyas Devaraju, Shivam Garg, Alexander Ihler, Sunil Kumar

    Abstract: Routing protocols help in transmitting the sensed data from UAVs monitoring the targets (called target UAVs) to the BS. However, the highly dynamic nature of an autonomous, decentralized UAV network leads to frequent route breaks or traffic disruptions. Traditional routing schemes cannot quickly adapt to dynamic UAV networks and/or incur large control overhead and delays. To establish stable, high…

    Submitted 7 May, 2024; originally announced May 2024.

  48. arXiv:2404.15618  [pdf, other]

    stat.ML cs.LG

    Neural Operator induced Gaussian Process framework for probabilistic solution of parametric partial differential equations

    Authors: Sawan Kumar, Rajdip Nayek, Souvik Chakraborty

    Abstract: The study of neural operators has paved the way for the development of efficient approaches for solving partial differential equations (PDEs) compared with traditional methods. However, most of the existing neural operators lack the capability to provide uncertainty measures for their predictions, a crucial aspect, especially in data-driven scenarios with limited available data. In this work, we p…

    Submitted 23 April, 2024; originally announced April 2024.

  49. arXiv:2404.13886  [pdf, other]

    cs.OS cs.ET

    Taming Server Memory TCO with Multiple Software-Defined Compressed Tiers

    Authors: Sandeep Kumar, Aravinda Prasad, Sreenivas Subramoney

    Abstract: Memory accounts for 33-50% of the total cost of ownership (TCO) in modern data centers. We propose a novel solution to tame memory TCO through the creation and judicious management of multiple software-defined compressed memory tiers. As opposed to the state-of-the-art solutions that employ a 2-Tier solution, a single compressed tier along with DRAM, we define multiple compressed tiers i…

    Submitted 22 April, 2024; originally announced April 2024.

  50. Visualizing Intelligent Tutor Interactions for Responsive Pedagogy

    Authors: Grace Guo, Aishwarya Mudgal Sunil Kumar, Adit Gupta, Adam Coscia, Chris MacLellan, Alex Endert

    Abstract: Intelligent tutoring systems leverage AI models of expert learning and student knowledge to deliver personalized tutoring to students. While these intelligent tutors have demonstrated improved student learning outcomes, it is still unclear how teachers might integrate them into curriculum and course planning to support responsive pedagogy. In this paper, we conducted a design study with five teach…

    Submitted 19 April, 2024; originally announced April 2024.

    Comments: 9 pages, 5 figures, ACM AVI 2024