Showing 1–10 of 10 results for author: Raman, M

  1. arXiv:2404.07815  [pdf, other]

    cs.LG cs.AI stat.ML

    Post-Hoc Reversal: Are We Selecting Models Prematurely?

    Authors: Rishabh Ranjan, Saurabh Garg, Mrigank Raman, Carlos Guestrin, Zachary Chase Lipton

    Abstract: Trained models are often composed with post-hoc transforms such as temperature scaling (TS), ensembling and stochastic weight averaging (SWA) to improve performance, robustness, uncertainty estimation, etc. However, such transforms are typically applied only after the base models have already been finalized by standard means. In this paper, we challenge this practice with an extensive empirical st…

    Submitted 11 April, 2024; originally announced April 2024.

    Comments: 9 pages + references + appendix, 7 figures

  2. arXiv:2311.18071  [pdf, other]

    cs.CV

    Turn Down the Noise: Leveraging Diffusion Models for Test-time Adaptation via Pseudo-label Ensembling

    Authors: Mrigank Raman, Rohan Shah, Akash Kannan, Pranit Chawla

    Abstract: The goal of test-time adaptation is to adapt a source-pretrained model to a continuously changing target domain without relying on any source data. Typically, this is either done by updating the parameters of the model (model adaptation) using inputs from the target domain or by modifying the inputs themselves (input adaptation). However, methods that modify the model suffer from the issue of comp…

    Submitted 29 November, 2023; originally announced November 2023.

    Comments: Accepted to the Workshop on Distribution Shifts: New Frontiers with Foundation Models at NeurIPS 2023

  3. arXiv:2307.16395  [pdf, other]

    cs.CV cs.AI cs.CL cs.LG

    Bridging the Gap: Exploring the Capabilities of Bridge-Architectures for Complex Visual Reasoning Tasks

    Authors: Kousik Rajesh, Mrigank Raman, Mohammed Asad Karim, Pranit Chawla

    Abstract: In recent times, there has been a surge of multi-modal architectures based on Large Language Models, which leverage the zero-shot generation capabilities of LLMs and project image embeddings into the text space and then use the auto-regressive capacity to solve tasks such as VQA, captioning, and image retrieval. We call these architectures "bridge-architectures" as they project from the image sp…

    Submitted 30 July, 2023; originally announced July 2023.

  4. arXiv:2303.07320  [pdf, other]

    cs.CL cs.LG

    Model-tuning Via Prompts Makes NLP Models Adversarially Robust

    Authors: Mrigank Raman, Pratyush Maini, J. Zico Kolter, Zachary C. Lipton, Danish Pruthi

    Abstract: In recent years, NLP practitioners have converged on the following practice: (i) import an off-the-shelf pretrained (masked) language model; (ii) append a multilayer perceptron atop the CLS token's hidden representation (with randomly initialized weights); and (iii) fine-tune the entire model on a downstream task (MLP-FT). This procedure has produced massive gains on standard NLP benchmarks, but t…

    Submitted 5 December, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

    Comments: Accepted to the EMNLP 2023 Conference

  5. arXiv:2103.01134  [pdf, other]

    cs.LG cs.CV

    Domain Generalization via Inference-time Label-Preserving Target Projections

    Authors: Prashant Pandey, Mrigank Raman, Sumanth Varambally, Prathosh AP

    Abstract: Generalization of machine learning models trained on a set of source domains to unseen target domains with different statistics is a challenging problem. While many approaches have been proposed to solve this problem, they only utilize source data during training but do not take advantage of the fact that a single target example is available at the time of inference. Motivated by this, we propose…

    Submitted 18 July, 2021; v1 submitted 1 March, 2021; originally announced March 2021.

  6. arXiv:2010.16095  [pdf, other]

    cs.LG

    Centralized active tracking of a Markov chain with unknown dynamics

    Authors: Mrigank Raman, Ojal Kumar, Arpan Chattopadhyay

    Abstract: In this paper, selection of an active sensor subset for tracking a discrete-time, finite-state Markov chain having an unknown transition probability matrix (TPM) is considered. A total of N sensors are available for making observations of the Markov chain, out of which a subset of sensors is activated each time in order to perform reliable estimation of the process. The trade-off is between activ…

    Submitted 30 October, 2020; originally announced October 2020.

    Comments: Accepted at the IEEE MASS 2020 Conference

  7. arXiv:2010.12873  [pdf, other]

    cs.CL

    Learning Contextualized Knowledge Structures for Commonsense Reasoning

    Authors: Jun Yan, Mrigank Raman, Aaron Chan, Tianyu Zhang, Ryan Rossi, Handong Zhao, Sungchul Kim, Nedim Lipka, Xiang Ren

    Abstract: Recently, knowledge graph (KG) augmented models have achieved noteworthy success on various commonsense reasoning tasks. However, KG edge (fact) sparsity and noisy edge extraction/generation often hinder models from obtaining useful knowledge to reason over. To address these issues, we propose a new KG-augmented model: Hybrid Graph Network (HGN). Unlike prior methods, HGN learns to jointly context…

    Submitted 4 June, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: Accepted to Findings of ACL-IJCNLP 2021. Code and data: https://github.com/INK-USC/HGN

  8. arXiv:2010.12872  [pdf, other]

    cs.CL cs.AI cs.LG

    Learning to Deceive Knowledge Graph Augmented Models via Targeted Perturbation

    Authors: Mrigank Raman, Aaron Chan, Siddhant Agarwal, Peifeng Wang, Hansen Wang, Sungchul Kim, Ryan Rossi, Handong Zhao, Nedim Lipka, Xiang Ren

    Abstract: Knowledge graphs (KGs) have helped neural models improve performance on various knowledge-intensive tasks, like question answering and item recommendation. By using attention over the KG, such KG-augmented models can also "explain" which KG information was most relevant for making a given prediction. In this paper, we question whether these models are really behaving as we expect. We show that, th…

    Submitted 3 May, 2021; v1 submitted 24 October, 2020; originally announced October 2020.

    Comments: 13 pages, 11 figures

  9. arXiv:2007.14284  [pdf, other]

    cs.CV cs.LG

    Discrepancy Minimization in Domain Generalization with Generative Nearest Neighbors

    Authors: Prashant Pandey, Mrigank Raman, Sumanth Varambally, Prathosh AP

    Abstract: Domain generalization (DG) deals with the problem of domain shift, where a machine learning model trained on multiple source domains fails to generalize well on a target domain with different statistics. Multiple approaches have been proposed to solve the problem of domain generalization by learning domain-invariant representations across the source domains that fail to guarantee generalization on t…

    Submitted 28 July, 2020; originally announced July 2020.

  10. arXiv:1908.11648  [pdf, other]

    cs.OS

    Porting of eChronos RTOS on RISC-V Architecture

    Authors: Shubhendra Pal Singhal, M. Sridevi, N Sathya Narayanan, M J Shankar Raman

    Abstract: eChronos is a formally verified Real-Time Operating System (RTOS) designed for embedded micro-controllers. eChronos was targeted for tightly constrained devices without memory management units. Currently, eChronos is available on proprietary designs like ARM, PowerPC and Intel architectures. eChronos is adopted in safety-critical systems such as aircraft control systems and medical implant devices. eCh…

    Submitted 26 December, 2019; v1 submitted 30 August, 2019; originally announced August 2019.

    Comments: 11 pages, 3 figures, Accepted for Publication for Springer LNCS Germany

    Report number: Submission Id - 205