LLMs Research
Posts
LLMs research papers published on April 24th

LLMs research papers published on April 24th

Summary of research papers categorized in following categories: LLMs performance improvement techniques, generative agents, LLMs benchmarks, LLMs survey paper, LLMs applications and survey papers

April 25, 2024

LLMs core performance improvement papers

URL: Universal Referential Knowledge Linking via Task-instructed Representation Compression

🤔Problem?:
The research paper addresses the problem of limited and fixed claim-reference relationships in current studies on referential knowledge linking (RKL). While RKL is critical for fulfilling human demands for authentic and reliable information, existing approaches are restricted to specific tasks and may not be able to handle the diverse and complex referential knowledge linking tasks in the real-world.

💻Proposed solution:
To solve this problem, the research paper proposes a universal referential knowledge linking (URL) model that aims to tackle various RKL tasks with one unified framework. This is achieved through a LLM-driven task-instructed representation compression, which effectively adapts the instruction following and semantic understanding abilities of LLMs to RKL. The proposed model also utilizes a multi-view learning approach to further enhance its performance. In summary, the URL model combines the strengths of LLMs and multi-view learning to effectively handle diverse and complex RKL tasks.

Online Personalizing White-box LLMs Generation with Neural Bandits

🤔Problem?: The research paper tackles efficiently personalizing text generation by LLMs to meet individual preferences without creating a unique model for each user.

💻Proposed solution: The research paper proposes an innovative method that uses neural bandit algorithms to dynamically optimize soft instruction embeddings based on user feedback. This enhances the personalization of open-ended text generation by white-box LLMs. Essentially, the method allows the LLM to adapt and improve its text generation based on how the user responds to the generated content.

📊Results: Framework achieves up to a 62.9% improvement in terms of best ROUGE scores and up to a 2.76% increase in LLM-agent evaluation against the baseline in personalized news headline generation.

zkLLM: Zero Knowledge Proofs for Large Language Models

The research paper proposes zkLLM, a specialized zero-knowledge proof tailored for LLMs to verify the correctness of the entire inference process of LLMs, while also protecting the privacy of the model's parameters. It works by leveraging the foundation of tlookup, a parallelized lookup argument designed for non-arithmetic tensor operations in deep learning. Additionally, the paper introduces zkAttn, a specialized zero-knowledge proof crafted for the attention mechanism in LLMs, which balances considerations of running time, memory usage, and accuracy.

📊Results:
The paper claims that their approach, powered by a fully parallelized CUDA implementation, enables the generation of a correctness proof for the entire inference process of LLMs with 13 billion parameters in under 15 minutes. The resulting proof is also compact in size, at less than 200 kB, and ensures no inadvertent information leakage.

Cantor: Inspiring Multimodal Chain-of-Thought of MLLM

🔗GitHub: https://ggg0919.github.io/cantor/

🤔Problem?:The paper aims to address the challenge of "determining hallucinations" in decision-making when tackling visual reasoning tasks due to insufficient visual information and the limitations of low-level perception tools.

💻Proposed solution:
The paper proposes an innovative framework called Cantor, which combines visual context acquisition and logical reasoning to enhance visual reasoning tasks. Cantor utilizes large language models (LLMs) enhanced by the chain-of-thought (CoT) methodology to decompose the problem into manageable sub-tasks and sequentially tackle them with various external tools. It also integrates visual inputs to analyze the image and problem, ensuring a closer alignment with the actual context. Additionally, Cantor leverages the advanced cognitive functions of multimodal large language models (MLLMs) to derive higher-level information, enhancing the CoT generation process.

A Human-Computer Collaborative Tool for Training a Single Large Language Model Agent into a Network through Few Examples

The research paper proposes EasyLAN, a human-computer collaborative tool, to help developers construct LANs. It works by initially generating a LAN with one agent based on the task description, and then using a few training examples to update the LAN. EasyLAN models the gap between the output and the ground truth, identifies errors, and addresses them through carefully designed strategies. Users can intervene or modify the LAN directly. This process allows the LAN to evolve from a single agent to a network of LLM agents.

Sequence can Secretly Tell You What to Discard

🤔Problem?:
Significant amount of memory occupied by the KV cache becomes a bottleneck for inference and it makes LLMs a memory fraining machine!

💻Proposed solution:
The research paper proposes a novel approach called CORM, which optimizes the KV cache and reduces its memory footprint. This is achieved by dynamically retaining important key-value pairs for inference without finetuning the model. CORM takes advantage of the high similarity between adjacent tokens' query vectors and the fact that current query's attention calculation can rely on a small portion of the preceding queries. This allows for a more efficient use of the KV cache without compromising performance.

📊Results:
The research paper validates that CORM can reduce the inference memory usage of KV cache by up to 70% without noticeable performance degradation across six tasks in LongBench. This demonstrates a significant improvement in memory usage and computational efficiency for LLMs during inference.

KGValidator: A Framework for Automatic Validation of Knowledge Graph Construction

🤔Problem?:
The research paper aims to address the challenge of validating information in knowledge graphs (KGs) at a large scale and high cost.

💻Proposed solution:
The proposed solution is to use Large Language Models (LLMs) as a generative agent for human-in-the-loop validation of KGs. This is made possible by recent developments in open-source tools for structural and semantic validation of LLM outputs, and by utilizing flexible approaches to fact checking and verification. The framework also allows for the incorporation of external knowledge sources for additional verification.

From Complex to Simple: Enhancing Multi-Constraint Complex Instruction Following Ability of Large Language Models

🤔Problem?:
The research paper addresses the issue of enhancing the ability of LLMs to follow complex instructions with multiple constraints.

💻Proposed solution:
Paper studyies what training data is effective in enhancing the LLM's understanding of complex instructions. They found that training LLMs with instructions containing multiple constraints can improve their ability to follow complex instructions, even for out-of-domain constraints. They also propose methods for obtaining and utilizing the effective training data. The training data and methods are then tested through extensive experiments to prove their effectiveness in terms of overall performance, training efficiency, and generalization abilities.

Benchmarks/LLMs evaluation

Towards a Holistic Evaluation of LLMs on Factual Knowledge Recall

The PRISM Alignment Project: What Participatory, Representative and Individualised Human Feedback Reveals About the Subjective and Multicultural Alignment of Large Language Models

📚Want to learn more, Survey paper

Act as a Honeytoken Generator! An Investigation into Honeytoken Generation with Large Language Models

The research paper proposes to solve the problem of scalability by utilizing Large Language Models (LLMs) to create a variety of honeytokens. This approach involves systematically investigating different prompt structures and building blocks to generate honeytokens of different types, such as configuration files, databases, and log files. The researchers also tested the performance of these honeytokens across different state-of-the-art LLMs to assess their effectiveness.

🧯Let’s make LLMs safe!!

Investigating the prompt leakage effect and black-box defenses for multi-turn LLM interactions

The research paper proposes a unique multi-turn threat model that leverages the LLM's sycophancy effect to analyze and dissect task instruction and knowledge leakage in the LLM responses. This model is used to measure the average attack success rate (ASR) in a multi-turn setting, which is found to be 86.2%, with some LLMs like GPT-4 and claude-1.3 leaking up to 99%. To mitigate this, the paper suggests implementing 6 black-box defense strategies, including a query-rewriter in the RAG scenario.

🌈 Creative ways to use LLMs!! (Applications)

Chat2Scenario: Scenario Extraction From Dataset Through Utilization of Large Language Model [GitHub: https://github.com/ftgTUGraz/Chat2Scenario]

Chat2Scenario uses LLMs which allows for the efficient extraction of driving scenarios from descriptive texts of driving conditions, using specified criticality metric thresholds. The extracted scenarios are then converted into ASAM OpenSCENARIO and IPG CarMaker text files, streamlining the scenario extraction process and enhancing efficiency.

Domain-Specific Improvement on Psychotherapy Chatbot Using Assistant

From Local to Global: A Graph RAG Approach to Query-Focused Summarization

The research paper proposes a Graph RAG (Retrieval-Augmented Generation) approach, which combines the strengths of two contrasting methods - retrieval-augmented generation and query-focused summarization. This approach uses a large language model to build a graph-based text index in two stages. First, an entity knowledge graph is derived from the source documents, and then community summaries are pregenerated for closely-related entities.

Classifying Human-Generated and AI-Generated Election Claims in Social Media

Telco-RAG: Navigating the Challenges of Retrieval-Augmented Language Models for Telecommunications

The research paper proposes Telco-RAG, a customized RAG framework specifically designed for handling the complexities of telecom standard documents, particularly 3rd Generation Partnership Project (3GPP) documents. Telco-RAG addresses the challenges of implementing a RAG pipeline on highly technical content by providing guidelines and offering a solution for applying LLMs in telecommunications.

Semantic Routing for Enhanced Performance of LLM-Assisted Intent-Based 5G Core Network Management and Orchestration

🤖LLMs for robotics

The Feasibility of Implementing Large-Scale Transformers on Multi-FPGA Platforms 🔥🔥

The research paper addresses the lack of discussion and exploration of using FPGAs in large machine learning applications, despite evidence showing their potential for efficient and high-performance computation. The research paper proposes to solve this problem by developing a scalable multi-FPGA platform and tools to map large applications to the platform. This platform would allow for the implementation and deployment of multi-FPGA applications, which currently lacks a commonly-accepted flow. The platform and tools would allow for the exploration of using multiple FPGAs in large machine learning applications, specifically for large transformers. The platform and tools were validated by designing a multi-FPGA version of the I-BERT transformer and implementing one encoder using six FPGAs as a proof-of-concept. This shows that with the right infrastructure and tools, it is feasible to explore the benefits of using FPGAs for large machine learning applications.

Reply

or to participate.