
Summary of LLMs research papers published on 17th April

A detailed summary and categorization of research papers published for LLMs.

Dear subscribers,

I am planning to revamp the newsletter to better track research in LLMs, in the following manner. Please share your thoughts and feedback with me on our Reddit, Twitter, and LinkedIn channels.

Research categorization of LLMs: A better way to track research in LLMs

New Models & πŸ”₯ research:

Atlas robot from Boston Dynamics

The research paper proposes Pack of LLMs (PackLLM), a method for test-time fusion that leverages knowledge from arbitrary user-specified LLMs during inference. PackLLM solves a small optimization problem to determine each LLM's importance based on its perplexity over the input prompt, and these importance weights are then used to combine the models' outputs, as in the sketch below.
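A minimal sketch of the idea, not the authors' implementation: weight each model by a softmax over its negative log prompt perplexity and mix the next-token distributions at inference time. The model names, shared tokenizer, and exact weighting scheme are illustrative assumptions.

```python
# Hedged sketch of perplexity-weighted test-time fusion (PackLLM-style idea).
# Model names, tokenizer sharing, and the weighting details are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_names = ["gpt2", "distilgpt2"]   # stand-ins for arbitrary user-specified LLMs
models = [AutoModelForCausalLM.from_pretrained(n) for n in model_names]
tok = AutoTokenizer.from_pretrained(model_names[0])  # assumes a shared vocabulary

prompt = "Translate to French: The weather is nice today."
ids = tok(prompt, return_tensors="pt").input_ids

def prompt_perplexity(model, ids):
    # Perplexity of the prompt under one model (teacher-forced next-token loss).
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return torch.exp(loss)

# Lower perplexity -> higher importance weight (softmax over negative log-perplexity).
ppls = torch.stack([prompt_perplexity(m, ids) for m in models])
weights = torch.softmax(-torch.log(ppls), dim=0)

# Fuse the next-token distributions using the importance weights.
with torch.no_grad():
    probs = [torch.softmax(m(ids).logits[0, -1], dim=-1) for m in models]
fused = sum(w * p for w, p in zip(weights, probs))
print(tok.decode([fused.argmax().item()]))
```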

πŸ”—Code: https://github.com/holmeswww/AgentKit

The research paper proposes AgentKit, an intuitive LLM prompting framework for constructing a complex "thought process" out of simple natural-language prompts.

AgentKit is designed to be modular and intuitive, allowing non-programmers to design and tune basic agents using only a list of prompts for subtasks.
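The repo linked above has the real interface; the sketch below only illustrates the general idea of a prompt graph, where each node is a plain-language subtask and parents' outputs feed into their children. The node names and the `call_llm` stub are hypothetical and are not AgentKit's API.

```python
# Hedged sketch of a node-based prompting framework: each node holds a
# natural-language subtask prompt, and parent outputs are fed to children.
# This is NOT AgentKit's actual API -- see the linked repository for that.
def call_llm(prompt: str) -> str:
    # Stub standing in for any chat-completion call (assumption).
    return f"<LLM output for: {prompt.splitlines()[-1]}>"

class Node:
    def __init__(self, name, prompt, parents=()):
        self.name, self.prompt, self.parents = name, prompt, list(parents)

    def run(self, results):
        # Collect parent outputs as context, then run this node's subtask prompt.
        context = "\n".join(f"{p.name}: {results[p.name]}" for p in self.parents)
        results[self.name] = call_llm(f"{context}\n\nTask: {self.prompt}")
        return results[self.name]

# A "thought process" written as a short list of plain-language subtasks.
observe = Node("observe", "Summarize the current game state.")
plan    = Node("plan", "Given the summary, list three candidate actions.", [observe])
act     = Node("act", "Pick the single best action and justify it briefly.", [plan])

results = {}
for node in (observe, plan, act):   # topological order of the prompt graph
    node.run(results)
print(results["act"])
```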

The Amazon AGI team proposes an approach to improve translation performance using LLMs.

The research paper proposes a preference-based approach using the Plackett-Luce model to improve translation performance in LLMs. This approach focuses on understanding translation preferences from a holistic view, rather than simply imitating reference translations at the token level. It is also designed to be more resilient in the absence of gold translations.
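For intuition, here is a generic Plackett-Luce log-likelihood over a ranking of candidate translations; it is not the paper's exact training objective, and the scores and ranking below are made-up numbers.

```python
# Hedged sketch: the generic Plackett-Luce likelihood of a preference ranking,
# not the paper's exact objective. `scores` are model scores for candidate
# translations; `ranking` lists candidate indices from most to least preferred.
import math

def plackett_luce_log_likelihood(scores, ranking):
    ll = 0.0
    remaining = list(ranking)
    for idx in ranking:
        # Probability of picking `idx` next among the still-unranked candidates.
        denom = sum(math.exp(scores[j]) for j in remaining)
        ll += scores[idx] - math.log(denom)
        remaining.remove(idx)
    return ll

# Three candidate translations ranked 2 > 0 > 1 by preference (illustrative numbers).
print(plackett_luce_log_likelihood([0.1, -0.3, 0.8], [2, 0, 1]))
```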

πŸ€”Problem?:
Paper helps understanding of the working mechanisms and optimization of compressed vectors derived from transformers in Large Language Models (LLMs). These compressed vectors are used for In-Context Learning (ICL) and have shown impressive performance in learning from few examples.

πŸ’»Proposed solution:
The research paper presents a comprehensive analysis of these compressed vectors and introduces a "state vector", inspired by model soup and momentum-based gradient descent.

The state vector is optimized using inner and momentum optimization methods to refine it progressively during test-time adaptation. Additionally, a divide-and-conquer aggregation method is proposed to address the challenge of using demonstrations with numerous examples in ICL. This approach enhances the state vector and leads to state-of-the-art performance on various tasks.
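A rough sketch of those two ideas with plain tensors, abstracting away how the state vector is actually extracted from transformer activations; the extraction stub, learning rate, momentum coefficient, group size, and toy gradient signal are assumptions for illustration only.

```python
# Hedged sketch of (1) momentum-based refinement of a compressed "state vector"
# and (2) divide-and-conquer aggregation over groups of demonstrations.
# Extraction from the transformer is abstracted into a stub (assumption).
import torch

def extract_state_vector(demo_group):
    # Placeholder: in the paper this comes from compressed transformer
    # activations over the demonstrations (simplification).
    return torch.randn(768)

def momentum_refine(state, grad_fn, steps=10, lr=0.05, beta=0.9):
    velocity = torch.zeros_like(state)
    for _ in range(steps):
        grad = grad_fn(state)              # gradient signal at test time
        velocity = beta * velocity + grad  # momentum accumulation
        state = state - lr * velocity
    return state

# Divide-and-conquer: split many demonstrations into groups, get one state
# vector per group, then aggregate (here: a simple mean).
demos = [f"example {i}" for i in range(64)]
groups = [demos[i:i + 8] for i in range(0, len(demos), 8)]
state_vectors = [extract_state_vector(g) for g in groups]
aggregated = torch.stack(state_vectors).mean(dim=0)

# Toy gradient signal for illustration only (pulls the vector toward zero).
refined = momentum_refine(aggregated, grad_fn=lambda v: v)
print(refined.norm())
```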

Many-Shot In-Context Learning [Google DeepMind πŸ”₯πŸ”₯πŸ”₯]

How much performance gain 📈 can many-shot in-context learning bring to the table? Well, Google has an answer! See the image above 👆

The DeepMind team used Gemini 1.5 Pro, which has a 1-million-token context length. But when we move to many-shot in-context learning, we might not have enough carefully crafted human-written responses. To address this issue, the team proposed two new settings: Reinforced ICL and Unsupervised ICL.

Reinforced ICL uses model-generated chain-of-thought rationales in place of human-written examples in the prompt. It works by prompting the model with domain-specific questions; the model generates its own reasoning chains, which are then used as in-context examples.

Unsupervised ICL, on the other hand, removes the rationales from the prompt altogether and only prompts the model with domain-specific questions. This approach allows the model to learn from a larger number of examples without relying on human-generated data.
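A toy sketch contrasting the two prompt regimes; the `call_llm` stub, the question set, and the crude answer-matching filter are illustrative assumptions, not the paper's pipeline.

```python
# Hedged sketch of the two prompt regimes described above (illustrative only).
def call_llm(prompt: str) -> str:
    # Stub standing in for a real LLM call (assumption).
    return "Let's reason step by step ... so the answer is 4"

def reinforced_icl_prompt(questions, answers):
    # The model writes its own chain-of-thought rationales; keep only those
    # whose final answer matches the known answer, and use them as demonstrations.
    shots = []
    for q, a in zip(questions, answers):
        rationale = call_llm(f"Q: {q}\nThink step by step, then give the answer.")
        if rationale.strip().endswith(str(a)):   # crude correctness filter (assumption)
            shots.append(f"Q: {q}\nA: {rationale}")
    return "\n\n".join(shots)

def unsupervised_icl_prompt(questions):
    # No rationales at all: the prompt is just a long list of domain-specific questions.
    return "\n\n".join(f"Q: {q}" for q in questions)

print(reinforced_icl_prompt(["What is 2 + 2?"], [4]))
print(unsupervised_icl_prompt(["What is 2 + 2?", "What is 3 + 5?"]))
```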

A Microsoft Research team proposes a new technique called "position engineering" to improve the performance of LLMs. The technique alters the positional information in the prompt without changing the text itself, making it more efficient than traditional prompt engineering, which requires extensive modifications to the text.

The results showed significant improvements in performance compared to the baseline.
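A small sketch of the core trick, assuming a Hugging Face causal LM that accepts explicit position IDs: the tokens stay the same, but a gap is inserted into the position indices between the instruction and the document. The model choice and gap size are arbitrary assumptions.

```python
# Hedged sketch: change positional information only, leaving the token
# sequence untouched. Model, prompt, and gap size are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

instruction = "Answer using the document below.\n"
document = "The Eiffel Tower is in Paris.\nQuestion: Where is the Eiffel Tower?"

inst_ids = tok(instruction, return_tensors="pt").input_ids
doc_ids = tok(document, return_tensors="pt").input_ids
input_ids = torch.cat([inst_ids, doc_ids], dim=1)

gap = 50  # pretend 50 "empty" positions sit between instruction and document
positions = torch.cat([
    torch.arange(inst_ids.shape[1]),
    torch.arange(doc_ids.shape[1]) + inst_ids.shape[1] + gap,
]).unsqueeze(0)

# Same tokens, edited positions -- the prompt text itself never changes.
with torch.no_grad():
    out = model(input_ids, position_ids=positions)
print(out.logits.shape)
```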


The research paper proposes a novel repair pipeline that integrates a dual-agent Large Language Model (LLM) framework. This framework consists of a Repair Agent and a Prompt Agent. The Repair Agent uses LLMs, specifically GPT-4 variants, to suggest potential repairs for bugs in the declarative specifications. The Prompt Agent then provides feedback and prompts the Repair Agent to generate better solutions. This process continues until a satisfactory repair is found.
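A hedged sketch of that loop: the Repair Agent proposes a fix, an external checker reports errors, and the Prompt Agent turns those errors into feedback for the next round. The `call_llm` and `check_specification` stubs, the prompts, and the stopping rule are all assumptions, not the paper's exact pipeline.

```python
# Hedged sketch of a dual-agent repair loop (Repair Agent + Prompt Agent).
def call_llm(prompt: str) -> str:
    # Stub standing in for a GPT-4-style chat call (assumption).
    return "<candidate specification or feedback>"

def check_specification(spec: str) -> str:
    # Stub standing in for an analyzer of the declarative specification
    # (assumption); returns an empty string when no errors are found.
    return ""

def repair(spec: str, max_rounds: int = 5) -> str:
    feedback = ""
    for _ in range(max_rounds):
        # Repair Agent: propose a fixed specification given the latest feedback.
        candidate = call_llm(f"Fix this specification:\n{spec}\n\nFeedback:\n{feedback}")
        errors = check_specification(candidate)
        if not errors:
            return candidate   # satisfactory repair found
        # Prompt Agent: turn checker output into a sharper prompt for the next round.
        feedback = call_llm(f"Summarize these errors and suggest what to change:\n{errors}")
    return spec                # give up after max_rounds

print(repair("sig File { parent: lone Dir }"))
```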

Papers with datasets/benchmarks:

πŸ“šWant to learn more, Survey paper:

🧯Let’s make LLMs safe!! (LLM security-related papers)

🌈 Creative ways to use LLMs!! (Application-focused papers)

πŸ€–LLMs for robotics:
