How do language models comprehend gibberish inputs? Our recent work with James Zou focuses on understanding the mechanisms by which LLMs can be manipulated into responding with coherent target text to seemingly gibberish inputs.

Paper: https://lnkd.in/gA9Mjqc4

A few takeaways:
- We show the prevalence of nonsensical prompts that induce LLMs to generate specific, coherent responses, which we call LM Babel.
- Examining the structure of Babel prompts, we find that despite their high perplexity, these prompts often contain nontrivial trigger tokens, maintain lower entropy than random token strings, and cluster together in the model's representation space.
- The efficiency of these prompts depends largely on the prompt length as well as the target text's length and perplexity.
- Reproducing harmful texts with aligned models is not only feasible but, in some cases, easier than reproducing benign texts, while fine-tuning language models to forget specific information makes it harder to direct them towards the unlearned content.
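For intuition on the two statistics mentioned above, here is a minimal sketch (not the paper's code) that computes a prompt's perplexity under a causal LM and the unigram entropy of its tokens; the model and the gibberish string are illustrative stand-ins:

```python
# Minimal sketch: perplexity of a prompt under a causal LM, and the
# (unigram) Shannon entropy of the prompt's token distribution.
import math
from collections import Counter

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; the paper studies larger LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

def perplexity(text: str) -> float:
    """Perplexity of `text` (exp of mean per-token cross-entropy)."""
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean next-token NLL
    return math.exp(loss.item())

def token_entropy(text: str) -> float:
    """Entropy (bits) of the empirical distribution over the prompt's tokens."""
    counts = Counter(tokenizer(text).input_ids)
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

babel = "pleas Quindi ->Title Wet ravi acest"  # hypothetical gibberish prompt
print(perplexity(babel), token_entropy(babel))
```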
-
Summary of the article “Head-to-Tail: How Knowledgeable are Large Language Models (LLM)? A.K.A. Will LLMs Replace Knowledge Graphs?”

Article link: https://lnkd.in/eihKpfya

The article discusses a novel approach to evaluating the knowledge of Large Language Models (LLMs): the Head-to-Tail benchmark, which comprises question-answer pairs categorised into head, torso, and tail entities based on their popularity, as determined by factors such as traffic and density. The evaluation metrics are accuracy (A), hallucination rate (H), and missing rate (M), which respectively measure the proportions of correct answers, hallucinated (incorrect) answers, and abstentions among LLM responses. Both LLM-based and rule-based methods are used to judge answer correctness.

The experimental findings suggest that while LLMs do possess factual knowledge encoded in their parameters, the quantity of this knowledge is limited. The limitation is particularly pronounced for long-tail entities, which are already sparsely represented in Knowledge Graphs (KGs) and are even more deficient in LLMs. The evaluation indicates notable gaps in LLMs' representation of factual knowledge, especially for torso and tail entities.

These findings point towards the need for further research on seamlessly integrating knowledge in both symbolic and neural forms, i.e. exploring new ways to enhance the factual-knowledge capabilities of LLMs, particularly for less popular entities in KGs.

Reference: Sun, K., Xu, Y. E., Zha, H., Liu, Y., & Dong, X. L. (2023). Head-to-Tail: How Knowledgeable are Large Language Models (LLM)? A.K.A. Will LLMs Replace Knowledge Graphs? arXiv. DOI: 10.48550/ARXIV.2308.10168

#artificialintelligence #knowledgegraph #largelanguagemodels
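For concreteness, a minimal sketch of how the three metrics relate, assuming each response has already been judged as correct, hallucinated, or missing (the judging itself is the LLM-based or rule-based step described above):

```python
# Minimal sketch (not the paper's code): accuracy (A), hallucination
# rate (H), and missing rate (M) from labeled responses. Labels are
# assumed to be "correct", "hallucinated", or "missing" (model abstained).
from collections import Counter

labels = ["correct", "missing", "hallucinated", "correct", "missing"]  # toy data

counts = Counter(labels)
n = len(labels)
A = counts["correct"] / n
M = counts["missing"] / n
H = counts["hallucinated"] / n  # equivalently H = 1 - A - M

print(f"A={A:.2f}  H={H:.2f}  M={M:.2f}")
```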
-
LLM as Optimizer! Indeed!

This research explores using large language models to solve optimization problems. Optimization problems are ubiquitous across domains: one tries to identify the best answer from a set of potential options, typically with mathematical, algorithmic approaches. This paper proposes an alternative: deriving effective solutions through conversational interactions with a language model.

How it works: the authors present a methodology called OPRO, an acronym for "Optimization by PROmpting". The user describes the problem in simple language and asks the language model to propose a solution. The researchers evaluate the idea on well-known problems, such as finding the most efficient route for a traveling salesperson and performing linear regression to fit a line to a set of data points. They also use it to improve language models themselves by optimizing the prompts given to them. A notable advantage of this approach is that the problem-solving behavior can be customized simply by changing how the problem is articulated to the model.

How does the model know what it is doing? It is given a "meta-prompt": a description of the task together with previously proposed solutions and their scores, which it uses to generate improved new solutions (see the sketch below).

Issues: challenges include keeping the model from repeatedly proposing identical suggestions and striking a balance between exploring novel solutions and refining existing ones.

Results: their tests were positive; in several instances, prompts optimized by language models outperformed those designed by humans.

In summary, the research posits that language models can address intricate optimization problems with a considerable degree of efficacy.

#llm #deeplearning #generativeai #deepmind

Paper: https://lnkd.in/gY8Gw7eB
Code: https://lnkd.in/gy58xnKm
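A minimal sketch of the OPRO-style loop described above; `llm` and `evaluate` are hypothetical stand-ins for a language-model call and a task-specific scorer, not the official implementation:

```python
# Minimal sketch of an OPRO-style optimization loop (illustrative only).
# `llm(prompt)` -> str and `evaluate(solution)` -> float are hypothetical.

def optimize(task_description, llm, evaluate, steps=20, top_k=8):
    history = []  # (score, solution) pairs seen so far
    for _ in range(steps):
        # Meta-prompt: task description plus the best previous solutions
        # and their scores, so the model can try to beat them.
        best = sorted(history, reverse=True)[:top_k]
        meta_prompt = task_description + "\n\nPrevious solutions and scores:\n"
        meta_prompt += "\n".join(f"score={s:.3f}: {sol}" for s, sol in best)
        meta_prompt += "\n\nPropose a new, better solution:"

        solution = llm(meta_prompt)
        history.append((evaluate(solution), solution))
    return max(history)  # (best_score, best_solution)
```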
Large Language Models as Optimizers
arxiv.org
-
Do Large Language Models Latently Perform Multi-Hop Reasoning? | Google
Sohee Yang, Elena Gribovskaya, Nora Kassner, Mor Geva, Sebastian Riedel

Abstract
We study whether Large Language Models (LLMs) latently perform multi-hop reasoning with complex prompts such as "The mother of the singer of 'Superstition' is". We look for evidence of a latent reasoning pathway where an LLM (1) latently identifies "the singer of 'Superstition'" as Stevie Wonder, the bridge entity, and (2) uses its knowledge of Stevie Wonder's mother to complete the prompt. We analyze these two hops individually and consider their co-occurrence as indicative of latent multi-hop reasoning. For the first hop, we test if changing the prompt to indirectly mention the bridge entity instead of any other entity increases the LLM's internal recall of the bridge entity. For the second hop, we test if increasing this recall causes the LLM to better utilize what it knows about the bridge entity. We find strong evidence of latent multi-hop reasoning for the prompts of certain relation types, with the reasoning pathway used in more than 80% of the prompts. However, the utilization is highly contextual, varying across different types of prompts. Also, on average, the evidence for the second hop and the full multi-hop traversal is rather moderate and only substantial for the first hop. Moreover, we find a clear scaling trend with increasing model size for the first hop of reasoning but not for the second hop. Our experimental findings suggest potential challenges and opportunities for future development and applications of LLMs.

👉 https://lnkd.in/dgdJnQfw

#machinelearning
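The paper measures internal recall via probes on hidden states; as a rough external proxy (an assumption for illustration, not the authors' method), one can compare the log-probability a model assigns to the bridge entity as a continuation of the indirect prompt:

```python
# Rough external proxy for first-hop recall: how likely is the bridge
# entity ("Stevie Wonder") as a continuation of a prompt that only
# mentions him indirectly? Not the paper's probe-based method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")  # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def continuation_logprob(prefix: str, continuation: str) -> float:
    """Total log-probability of `continuation` given `prefix`.

    Assumes appending the continuation doesn't change how the prefix
    tokenizes (true here, since " Stevie" starts with a space).
    """
    prefix_len = tok(prefix, return_tensors="pt").input_ids.shape[1]
    full_ids = tok(prefix + continuation, return_tensors="pt").input_ids
    with torch.no_grad():
        log_probs = model(full_ids).logits.log_softmax(-1)
    total = 0.0
    for pos in range(prefix_len, full_ids.shape[1]):
        token = full_ids[0, pos]
        total += log_probs[0, pos - 1, token].item()  # P(token | tokens before it)
    return total

indirect = "The singer of 'Superstition' is"
print(continuation_logprob(indirect, " Stevie Wonder"))
```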
-
Does the order in which we present information influence our ability to reason and make decisions? This question forms the core of an exploration into the workings of large language models (LLMs) and their reasoning capabilities.

Let's consider a simple example of how the order of information can affect the performance of language models like GPT-4-turbo in reasoning tasks. Imagine a reasoning problem based on the following premises:

1. If it rains, the ground will be wet. (Premise A)
2. It is raining. (Premise B)

The logical conclusion is that the ground will be wet. If we present the premises to the language model in the order listed above (A then B), the order aligns with the logical steps needed to reach the conclusion: the model sees that it is raining (Premise B) and, based on Premise A, can easily conclude that the ground must be wet.

However, if we swap the order and present Premise B before Premise A, the model might still reach the correct conclusion, but the rearranged order can make it harder for the model to process the information as smoothly as before: it first learns that it is raining, but without the immediate context of what happens when it rains, which comes from Premise A. When the information is presented in an order that logically builds up to the conclusion, it is easier for the model to follow along and make the right deduction.

This behavior highlights a potential limitation in LLMs' reasoning capabilities and suggests that their performance can be influenced by the structure and presentation of information, rather than just its logical content. See the sketch below for a minimal way to compare the two orderings.

https://lnkd.in/ghnsDRbX
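A minimal sketch of such an order-sensitivity check, assuming an OpenAI-compatible client; the model name and prompts are illustrative, not the paper's experimental setup:

```python
# Minimal sketch: ask the same reasoning question with premises in two
# orders and compare the answers. Illustrative, not the paper's protocol.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

premise_a = "If it rains, the ground will be wet."
premise_b = "It is raining."
question = "Is the ground wet? Answer yes or no, then explain briefly."

def ask(premises):
    prompt = " ".join(premises) + " " + question
    resp = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print("A then B:", ask([premise_a, premise_b]))
print("B then A:", ask([premise_b, premise_a]))
```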
Premise Order Matters in Reasoning with Large Language Models
arxiv.org
-
Observational Scaling Laws and the Predictability of Language Model Performance
Yangjun Ruan, Chris J. Maddison, Tatsunori Hashimoto

Abstract
Understanding how language model performance varies with scale is critical to benchmark and algorithm development. Scaling laws are one approach to building this understanding, but the requirement of training models across many different scales has limited their use. We propose an alternative, observational approach that bypasses model training and instead builds scaling laws from ~80 publicly available models. Building a single scaling law from multiple model families is challenging due to large variations in their training compute efficiencies and capabilities. However, we show that these variations are consistent with a simple, generalized scaling law where language model performance is a function of a low-dimensional capability space, and model families only vary in their efficiency in converting training compute to capabilities. Using this approach, we show the surprising predictability of complex scaling phenomena: we show that several emergent phenomena follow a smooth, sigmoidal behavior and are predictable from small models; we show that the agent performance of models such as GPT-4 can be precisely predicted from simpler non-agentic benchmarks; and we show how to predict the impact of post-training interventions like Chain-of-Thought and Self-Consistency as language model capabilities continue to improve.

👉 https://lnkd.in/dy-XTGB3

#machinelearning
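To make the "smooth, sigmoidal behavior" concrete, here is a minimal sketch (with made-up numbers, not the paper's data) of fitting benchmark accuracy as a sigmoid of a scalar capability score:

```python
# Minimal sketch: emergent-looking benchmark accuracy modeled as a smooth
# sigmoid of a scalar "capability" score. Data points are hypothetical.
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, a, b):
    return 1.0 / (1.0 + np.exp(-(a * x + b)))

capability = np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0])     # hypothetical scores
accuracy   = np.array([0.02, 0.05, 0.20, 0.55, 0.85, 0.97])  # hypothetical results

(a, b), _ = curve_fit(sigmoid, capability, accuracy, p0=[1.0, 0.0])
# Extrapolate: predicted accuracy for a not-yet-trained, more capable model
print("predicted accuracy at capability 4:", sigmoid(4.0, a, b))
```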
-
A method to create performance rankings of LLMs without human feedback:

Current benchmarks for measuring the performance of large language models (LLMs) require either human feedback (i.e. gold-standard answers) or rely on a single strong LLM as the rater. The "rating network" method presented in the following blog post works without either requirement (one possible shape of such a scheme is sketched below). https://lnkd.in/erKctgXm
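For intuition only, here is one way a peer-rating scheme could look, where every model rates every other model's answers and scores are averaged. This is a hedged sketch of the general idea, not necessarily the blog's actual method:

```python
# Hedged sketch of a generic peer-rating scheme (not the blog's method).
# `generate(model, prompt)` -> answer and `rate(judge, prompt, answer)` -> float
# are hypothetical stand-ins for model calls.

def peer_rank(models, prompts, generate, rate):
    scores = {m: [] for m in models}
    for prompt in prompts:
        answers = {m: generate(m, prompt) for m in models}
        for author, answer in answers.items():
            for judge in models:
                if judge != author:  # no self-rating
                    scores[author].append(rate(judge, prompt, answer))
    # Rank models by mean rating received from their peers
    return sorted(models,
                  key=lambda m: sum(scores[m]) / len(scores[m]),
                  reverse=True)
```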
Performance rankings of large language models without strong LLM reference or human judgments / gold references
lardel.li
-
Knowledge Fusion of Large Language Models
Fanqi Wan, Xinting Huang, Deng Cai, Xiaojun Quan, Wei Bi, Shuming Shi

Abstract
While training large language models (LLMs) from scratch can generate models with distinct functionalities and strengths, it comes at significant costs and may result in redundant capabilities. Alternatively, a cost-effective and compelling approach is to merge existing pre-trained LLMs into a more potent model. However, due to the varying architectures of these LLMs, directly blending their weights is impractical. In this paper, we introduce the notion of knowledge fusion for LLMs, aimed at combining the capabilities of existing LLMs and transferring them into a single LLM. By leveraging the generative distributions of source LLMs, we externalize their collective knowledge and unique strengths, thereby potentially elevating the capabilities of the target model beyond those of any individual source LLM. We validate our approach using three popular LLMs with different architectures (Llama-2, MPT, and OpenLLaMA) across various benchmarks and tasks. Our findings confirm that the fusion of LLMs can improve the performance of the target model across a range of capabilities such as reasoning, commonsense, and code generation. Our code, model weights, and data are public at https://lnkd.in/dW555K-X

#machinelearning
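A minimal sketch of the core distillation idea, assuming (unlike the paper's setting) that all models share one tokenizer, so the vocabulary-alignment step the paper addresses is skipped here:

```python
# Minimal sketch: distill a target model toward a fused distribution built
# from several source models' token distributions. Illustrative only; real
# knowledge fusion must also align differing vocabularies.
import torch
import torch.nn.functional as F

def fusion_loss(target_logits, source_logits_list, weights):
    """KL divergence from a weighted fusion of source distributions.

    target_logits: (batch, seq, vocab); source_logits_list: list of tensors
    with the same shape; weights: per-source mixing coefficients.
    """
    with torch.no_grad():
        fused = sum(w * F.softmax(s, dim=-1)
                    for w, s in zip(weights, source_logits_list))
        fused = fused / fused.sum(dim=-1, keepdim=True)  # renormalize mixture
    log_probs = F.log_softmax(target_logits, dim=-1)
    # kl_div expects log-probabilities as input and probabilities as target
    return F.kl_div(log_probs, fused, reduction="batchmean")
```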
Knowledge Fusion of Large Language Models
arxiv.org
-
https://lnkd.in/gnxfPrWi

Large Language Models are currently all the rage, and they are quite helpful in getting research accomplished. How do they work? Terms are grouped according to their similarity, forming clustered islands of concepts and semantic comprehension. These clusters are then situated within a shared space, creating what is known as an embedding space. To unleash and fully utilize the power of the LLM, we engage in a statistical game of hopscotch, swiftly moving from island to island, grouping to grouping.
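A minimal sketch of the "islands" intuition: embed a few terms with a small public embedding model and watch related terms land near each other:

```python
# Minimal sketch: related terms cluster together in embedding space.
# Uses sentence-transformers with a common public checkpoint.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")
terms = ["dog", "puppy", "cat", "banana", "apple", "mango"]
emb = model.encode(terms, normalize_embeddings=True)

# Cosine similarity = dot product of normalized vectors
sim = emb @ emb.T
for i, term in enumerate(terms):
    nearest = max((j for j in range(len(terms)) if j != i),
                  key=lambda j: sim[i, j])
    print(f"{term} -> {terms[nearest]}")  # animals near animals, fruit near fruit
```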
Ways to Understand the Operation of LLMs Intuitively.
satisologie.substack.com
-
Defining tests in natural language, using the same terminology the business uses, is one of the ways we can both help non-technical team members get involved with quality and make sure we're testing the right things. But it has always come with challenges: behind that natural language is a whole lot of code wiring it to actions that can be taken within the system. What if the language changes but the code doesn't? What if the interface changes?

Here's an example of using natural language to test a web sign-up with no code. Behind the scenes this takes a screenshot, hands it over to GPT-4V, and asks for an action to take to achieve an objective (a sketch of the idea follows below). Would a GPT tester give you confidence to release to production?

#artificialintelligence #gpt4v
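A hedged sketch of the loop described above (screenshot in, suggested action out); the model name and prompt are illustrative, and actually executing the returned action against the browser is left out:

```python
# Hedged sketch: screenshot -> GPT-4V -> suggested next UI action.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def next_action(screenshot_path: str, objective: str) -> str:
    with open(screenshot_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()
    resp = client.chat.completions.create(
        model="gpt-4-vision-preview",  # illustrative model name
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": f"Objective: {objective}. Looking at this screenshot, "
                         "what single UI action (click/type/scroll, with its "
                         "target) should be taken next?"},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content

# Hypothetical usage:
# print(next_action("signup_page.png", "Sign up with email test@example.com"))
```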
-
Meet LLM-Blender: A Novel Ensembling Framework to Attain Consistently Superior Performance by Leveraging the Diverse Strengths of Multiple Open-Source Large Language Models (LLMs) https://lnkd.in/e2exYcha