May 20th, 2024
Takeaways from today's newsletter
OpenRLHF: Easy to use, open source RLHF framework
RapidIn: Identifying which training data influenced a given generation
Framework to quantify in-context reasoning and memorization in LLMs
TDD: Understanding the role of each token in the response
KG-RAG: Knowledge graphs meet language model agents!
Core research improving LLMs!
💡Why?: The research paper addresses the challenge of scaling reinforcement learning from human feedback (RLHF) for training LLMs.
💻How?: To solve this problem, the paper proposes OpenRLHF, an open-source framework that uses Ray, vLLM, and DeepSpeed to efficiently coordinate the four models involved in RLHF training, even beyond 70B parameters. The framework improves resource utilization, supports diverse training approaches, and integrates with Hugging Face for ease of use. It implements several alignment techniques, including RLHF, DPO, and rejection sampling.
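As one concrete example of the alignment objectives such a framework implements, here is a minimal sketch of the DPO loss in plain PyTorch. This is an illustrative stand-in, not OpenRLHF's actual API; the per-sequence log-probabilities are assumed to come from the policy and a frozen reference model.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO loss from per-sequence log-probabilities of chosen/rejected answers."""
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Maximize the margin between the chosen and rejected completions.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Dummy log-probabilities for a batch of 4 preference pairs.
loss = dpo_loss(torch.randn(4), torch.randn(4), torch.randn(4), torch.randn(4))
print(loss.item())
```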
💡Why?: The research paper addresses the problem of identifying which training data contributed to a given generation by an LLM. This is important for improving the transparency and interpretability of LLMs, as well as for debugging and enhancing their performance.
💻How?: The research paper proposes a framework called RapidIn, which works in two stages: caching and retrieval.
In the caching stage, the gradient vectors are compressed by over 200,000x and stored on disk or in GPU/CPU memory. In the retrieval stage, given a generation, RapidIn efficiently traverses the cached gradients to estimate the influence of each training data point within minutes, achieving a speedup of over 6,326x. It also supports multi-GPU parallelization to further accelerate the process.
📊Results: The reported empirical results confirm the efficiency and effectiveness of RapidIn: influence estimation is over 6,326x faster than with traditional influence-estimation methods, making it a highly efficient and scalable solution.
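A rough sketch of the underlying idea, assuming influence is approximated by the inner product between per-example training gradients and the gradient of the target generation, with a random-projection compression standing in for RapidIn's caching scheme (dimensions and gradients below are placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 10_000, 64                              # full vs. compressed gradient dimension
proj = rng.choice([-1.0, 1.0], size=(k, d)) / np.sqrt(k)   # Rademacher sketch

# Caching stage: compress and store one vector per training example.
train_grads = rng.normal(size=(500, d))        # placeholder per-example gradients
cache = train_grads @ proj.T                   # (500, k): far cheaper to store

# Retrieval stage: compress the generation's gradient once, then rank
# training examples by inner product with their cached sketches.
gen_grad = rng.normal(size=d)
scores = cache @ (proj @ gen_grad)
top_influential = np.argsort(-scores)[:10]
print(top_influential)
```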
💡Why?: The research paper aims to quantify the memorization and in-context reasoning effects that LLMs rely on for language generation.
💻How?: The research paper proposes an axiomatic system that categorizes and quantifies the memorization and in-context reasoning effects of LLMs. The system formulates these effects as non-linear interactions between the tokens/words encoded by the LLM, enabling a clear disentanglement of the effects and making it easier to examine the detailed inference patterns encoded by LLMs.
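To make "non-linear interactions between tokens" concrete, here is a toy sketch that scores every token subset by inclusion-exclusion over a scalar model score, in the spirit of Harsanyi interactions. The scoring function is a made-up stand-in, not the paper's exact formulation:

```python
from itertools import combinations

def interaction_effects(tokens, v):
    """Return the interaction effect I(S) for every subset S of token indices,
    where I(S) = sum over T subset of S of (-1)^(|S|-|T|) * v(T)."""
    idx = list(range(len(tokens)))
    effects = {}
    for r in range(len(idx) + 1):
        for S in combinations(idx, r):
            effects[S] = sum(
                (-1) ** (len(S) - len(T)) * v(T)
                for q in range(len(S) + 1)
                for T in combinations(S, q)
            )
    return effects

# Dummy score that rewards the co-occurrence of tokens 0 and 2,
# i.e. a genuinely non-linear (interaction) effect.
score = lambda T: 1.0 if {0, 2} <= set(T) else 0.2 * len(T)
print(interaction_effects(["The", "cat", "sat"], score))
```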
💡Why?: The research paper explores the role individual tokens play in shaping the responses of LLMs, aiming to support accurate interpretation and manipulation of prompts.
💻How?: The research paper proposes a method called Token Distribution Dynamics (TDD) to unveil and manipulate the role of prompts in generating LLM outputs. TDD leverages the interpreting capabilities of the language model head (LM head) to assess input saliency: it projects input tokens into the embedding space and uses distribution dynamics over the vocabulary to estimate their significance. TDD offers three variants - forward, backward, and bidirectional - to provide unique insights into token relevance.
📊Results: TDD surpasses state-of-the-art baselines by a significant margin in elucidating the causal relationships between prompts and LLM outputs. TDD is also applied to two prompt manipulation tasks for controlled text generation - zero-shot toxic language suppression and sentiment steering - and proves effective at identifying and mitigating toxicity or modulating sentiment in the generated content.
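A rough, illustrative approximation of the forward-style idea: project each input token's final hidden state through the LM head and read off how much probability it assigns to the generated target token. This uses GPT-2 via Hugging Face Transformers for concreteness and is not the paper's exact TDD computation:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

prompt, target = "The movie was absolutely", " wonderful"
inputs = tok(prompt, return_tensors="pt")
target_id = tok(target).input_ids[0]

with torch.no_grad():
    hidden = model(**inputs, output_hidden_states=True).hidden_states[-1][0]
    # Distribution over the vocabulary induced by each input position.
    probs = torch.softmax(model.lm_head(hidden), dim=-1)

saliency = probs[:, target_id]   # how strongly each token points toward the target
for token, s in zip(tok.convert_ids_to_tokens(inputs.input_ids[0]), saliency):
    print(f"{token:>12s}  {s.item():.4f}")
```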
💡Why?: The research paper addresses how to ensure factual accuracy in intelligent agent systems while preserving the creative capabilities of Large Language Model Agents (LMAs).
💻How?: The research paper proposes the KG-RAG (Knowledge Graph-Retrieval Augmented Generation) pipeline. This framework integrates structured Knowledge Graphs (KGs) with LMAs to enhance their knowledge capabilities. The pipeline first constructs a KG from unstructured text and then performs information retrieval over the graph to answer knowledge-based questions, using a novel algorithm called Chain of Explorations (CoE) that leverages LLM reasoning to sequentially explore nodes and relationships within the KG.
📊Results: Preliminary experiments on the ComplexWebQuestions dataset showed notable improvements in reducing hallucinated content and suggest a promising path towards intelligent systems adept at handling knowledge-intensive tasks, although the paper does not report specific performance numbers.
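The general shape of LLM-guided graph exploration can be sketched as below, with a placeholder `ask_llm` helper standing in for the actual model call; this captures the gist of sequential exploration, not the paper's exact CoE algorithm:

```python
import networkx as nx

kg = nx.DiGraph()
kg.add_edge("Ada Lovelace", "Charles Babbage", relation="collaborated_with")
kg.add_edge("Charles Babbage", "Analytical Engine", relation="designed")

def ask_llm(question, node, options):
    # Placeholder: a real system would prompt an LLM with the question, the
    # current node, and the candidate (relation, neighbor) pairs, and parse
    # which edge to follow (or whether to stop).
    return options[0] if options else None

def chain_of_explorations(question, seed, max_hops=3):
    node, path = seed, [seed]
    for _ in range(max_hops):
        options = [(d["relation"], nbr) for _, nbr, d in kg.out_edges(node, data=True)]
        step = ask_llm(question, node, options)
        if step is None:
            break
        node = step[1]
        path.append(node)
    return path

print(chain_of_explorations("What machine did Ada Lovelace's collaborator design?",
                            "Ada Lovelace"))
```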
💡Why?: The research paper addresses the problem of ineffective domain transferability in LLM-based clarification strategies for conversational search engines.
💻How?: The paper proposes a novel method called Style, which aims to improve domain transferability by producing tailored strategies for each individual domain. Style utilizes the context understanding capability of LLMs and their access to domain-specific sources of knowledge to create effective strategies for each domain in a post-hoc manner. This allows for rapid transferability and better performance on unseen domains.
📊Results: The experimental results of the research paper showed an average search performance improvement of ~10% on four unseen domains when using the Style method. This demonstrates the effectiveness of the proposed method in achieving domain transferability and improving search performance.
LLM evaluations
Let's make LLMs safe!!
💡Why?: The research paper investigates the potential for privacy invasion through input reconstruction attacks, where a malicious model provider could recover user inputs from embeddings.
💻How?: To study this problem, the research paper proposes two base methods for reconstructing original texts from a model's hidden states. These methods are effective against embeddings from shallow layers, but their effectiveness decreases for embeddings from deeper layers. To address this, the paper introduces Embed Parrot, a Transformer-based approach that can effectively reconstruct inputs from deep-layer embeddings and shows stable performance across various token lengths and data distributions. The paper also suggests a defense mechanism to deter exploitation of the embedding reconstruction process.
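A toy illustration of why shallow-layer embeddings leak the input: at the very first layer, each position's vector can simply be matched back to the nearest row of the token embedding table. Deeper layers mix information across positions, which is where a learned inversion model like Embed Parrot becomes necessary; the nearest-neighbor attack below (GPT-2 via Hugging Face Transformers) is only the baseline intuition, not the paper's method:

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
emb_table = model.get_input_embeddings().weight        # (vocab_size, hidden_dim)

text = "my password is hunter2"
ids = tok(text, return_tensors="pt").input_ids

with torch.no_grad():
    leaked = model.get_input_embeddings()(ids)[0]      # embeddings a server might store
    # Reconstruct tokens by cosine similarity against the embedding table.
    sims = F.normalize(leaked, dim=-1) @ F.normalize(emb_table, dim=-1).T

recovered = tok.decode(sims.argmax(dim=-1))
print(recovered)   # exact recovery for layer-0 embeddings
```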
💡Why?: The research paper addresses the potential risks of data contamination in LLMs arising from the use of unchecked, ultra-large-scale training sets.
💻How?: The research paper proposes a holistic method called Polarized Augment Calibration (PAC) to detect and diminish the effects of data contamination. PAC extends the popular Membership Inference Attack (MIA) technique by targeting and clarifying invisible training data, and it can be easily integrated with existing white- and black-box LLMs as a plug-and-play solution.
📊Results: PAC outperforms existing methods by at least 4.5% in detecting data contamination across more than 4 dataset formats and over 10 base LLMs. Additionally, real-world case studies highlight the significant presence of data contamination and related issues.
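For intuition, here is a minimal sketch of the kind of membership-inference signal such methods build on: the min-k% heuristic, which scores a text by the average log-probability of its least likely tokens under the model (text seen in training tends to score suspiciously high). This is the generic MIA baseline, not PAC itself, and uses GPT-2 purely for illustration:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def min_k_score(text, k=0.2):
    """Average log-probability of the k% least likely tokens in `text`."""
    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_lp = logprobs.gather(1, ids[0, 1:, None]).squeeze(1)   # log p(token | prefix)
    n = max(1, int(k * len(token_lp)))
    return token_lp.topk(n, largest=False).values.mean().item()

# A famous (likely memorized) line vs. a nonsense sentence.
print(min_k_score("To be, or not to be, that is the question."))
print(min_k_score("Flibber the quantum raccoon audited seventeen umbrellas."))
```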
Creative ways to use LLMs!!
The research paper proposes a novel pipeline for human travel trajectory mining. It uses large language models (LLMs) to annotate Points of Interest (POI) with activity types and then applies a Bayesian-based algorithm to infer the activity at each stay point in a trajectory. By leveraging the strong inferential and comprehension capabilities of LLMs, the pipeline incorporates semantic information from POI data and improves the quality of trajectory mining.
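A small sketch of the Bayesian step, under the assumption that the pipeline combines LLM-annotated activity priors for nearby POIs with a time-of-day likelihood; the categories and numbers below are made up for illustration:

```python
# Prior over activities from LLM-annotated POIs near the stay point.
poi_prior = {"dining": 0.5, "work": 0.3, "shopping": 0.2}

# Likelihood of a stay at 13:00 given each activity.
time_likelihood = {"dining": 0.6, "work": 0.3, "shopping": 0.1}

# Bayes rule: posterior ~ prior * likelihood, then normalize.
posterior = {a: poi_prior[a] * time_likelihood[a] for a in poi_prior}
total = sum(posterior.values())
posterior = {a: p / total for a, p in posterior.items()}

print(max(posterior, key=posterior.get), posterior)   # -> 'dining'
```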
The research paper proposes a novel multi-agent framework, implemented as a company called TransAgents, to address the complex demands of translating literary works. This framework leverages the collective capabilities of multiple agents, based on large language models, and mirrors the traditional translation publication process. It also introduces two innovative evaluation strategies, MHP and BLP, to assess the effectiveness of the system. MHP evaluates translations from the perspective of monolingual readers, while BLP compares translations directly with the original texts using advanced language models.
This paper showcases the design of a commanding system for game agents.
It uses a code-generation LLM to translate the player's language commands into behaviour branches. This allows for a more flexible and dynamic approach compared to traditional rule-based methods. The game agents can then understand and execute a wider range of commands, enhancing the overall gameplay experience for players.
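The core loop can be sketched roughly as follows: the player's command is handed to a code-generation model, which returns a small behaviour branch (here, a JSON list of primitive actions) that the agent executes. `call_llm` and the action set are hypothetical placeholders, not the paper's actual interface:

```python
import json

PROMPT = """Translate the player's command into a JSON list of actions chosen
from: move_to(target), attack(target), defend(target), gather(resource).
Command: {command}
JSON:"""

def call_llm(prompt):
    # Placeholder response; a real system would query a code-generation LLM here.
    return '[{"action": "move_to", "target": "bridge"}, {"action": "defend", "target": "bridge"}]'

def command_to_branch(command):
    raw = call_llm(PROMPT.format(command=command))
    return json.loads(raw)                 # the behaviour branch for the agent to run

for step in command_to_branch("Hold the bridge until reinforcements arrive"):
    print("executing:", step["action"], "->", step["target"])
```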
This research paper proposes a framework called SetItUp, which uses a combination of training examples and a human-crafted program sketch to learn arrangement rules for specific scene types. It decomposes the arrangement problem into two subproblems: learning arrangement patterns from limited data and grounding these abstract relationships into object poses. SetItUp leverages LLMs to propose abstract spatial relationships among objects and uses a library of diffusion models to find object poses that satisfy these relationships.
💡Why?: Enhancing single object tracking (SOT) by integrating natural language descriptions from a video, in order to achieve more precise tracking of a specified object.
💻How?: Introducing a new method called DTLLM-VLT (Dynamic Text Language Model for Visual Language Tracking). This method automatically generates scientific and multi-granularity text descriptions using a cohesive prompt framework, which can be seamlessly integrated into various visual tracking benchmarks. It leverages high-level semantic information to guide object tracking, thereby alleviating the constraints associated with relying on a visual modality. Additionally, it offers four granularity combinations for different tracking benchmarks, taking into consideration the extent and density of semantic information. This allows for a more fine-grained evaluation of multi-modal trackers.
This research paper proposes a prompt-guided interaction procedure. This involves asking a powerful LLM to assign sensible skill labels to math questions, followed by performing semantic clustering to obtain coarser families of skill labels. These coarse skill labels are then presented to the LLM while solving test questions, and it is asked to identify the skill needed. It is then presented with randomly selected exemplar solved questions associated with that skill label. This methodology is domain-agnostic and can be applied to any problem, although the research paper focuses on math problems.
📊Results: The research paper achieves improved accuracy on two math datasets (GSM8K and MATH) for several strong LLMs, including code-assisted models. This improvement is seen when the LLM is presented with the full list of skill labels and asked to identify the skill needed to solve a particular question. This suggests that the skill labels obtained through the prompt-guided interaction capture genuinely useful problem-solving knowledge.
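A rough sketch of the clustering step, assuming the fine-grained skill labels come from an earlier LLM annotation pass; TF-IDF plus k-means is a stand-in for whatever semantic clustering the paper actually uses, and the labels below are invented examples:

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

# Fine-grained skill labels (assumed to come from an LLM annotation pass).
skill_labels = [
    "solving linear equations", "systems of linear equations",
    "area of triangles", "circle geometry",
    "prime factorization", "greatest common divisor",
]

# Group them into coarser skill families that can index exemplar questions.
X = TfidfVectorizer().fit_transform(skill_labels)
families = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

for label, family in zip(skill_labels, families):
    print(f"family {family}: {label}")
```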