This research presents GraphAgent-Reasoner, a novel multi-agent collaboration framework that improves the accuracy of graph reasoning with Large Language Models (LLMs), addressing their limitations in handling complex graph structures.
Key Points
GraphAgent-Reasoner decomposes graph problems into smaller tasks, allowing multiple agents to collaborate, which reduces complexity and improves accuracy in reasoning tasks.
The framework scales efficiently, accommodating larger graphs with over 1,000 nodes by simply increasing the number of agents involved in the reasoning process.
Evaluated on the GraphInstruct dataset, GraphAgent-Reasoner achieves near-perfect accuracy on polynomial-time graph reasoning tasks, outperforming existing models significantly, including both closed-source and fine-tuned open-source variants.
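A minimal, LLM-free sketch of the decomposition idea described above: one lightweight agent per node exchanges messages with its neighbors until the answer converges, and scaling to larger graphs simply means instantiating more agents. The `NodeAgent` class and the shortest-path task are illustrative assumptions, not the paper's actual agent protocol.

```python
from collections import defaultdict

class NodeAgent:
    """Toy agent responsible for a single node; real GraphAgent-Reasoner agents are LLM-backed."""
    def __init__(self, node_id, neighbors):
        self.node_id = node_id
        self.neighbors = neighbors
        self.distance = float("inf")  # local state: best-known distance to the source

    def step(self, inbox):
        """Process neighbor messages (their distances) and return messages to send."""
        updated = False
        for dist in inbox:
            if dist + 1 < self.distance:
                self.distance = dist + 1
                updated = True
        # Only propagate when the local state improved, mirroring distributed BFS.
        return {n: self.distance for n in self.neighbors} if updated else {}

def distributed_bfs(edges, source):
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)
        graph[v].append(u)
    agents = {n: NodeAgent(n, graph[n]) for n in graph}
    agents[source].distance = 0
    inboxes = defaultdict(list)
    for n in graph[source]:
        inboxes[n].append(0)
    # Rounds of message passing until no agent updates its state.
    while inboxes:
        next_inboxes = defaultdict(list)
        for node, msgs in inboxes.items():
            for target, dist in agents[node].step(msgs).items():
                next_inboxes[target].append(dist)
        inboxes = next_inboxes
    return {n: a.distance for n, a in agents.items()}

print(distributed_bfs([(0, 1), (1, 2), (2, 3), (0, 3)], source=0))
```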
This research investigates how solver-generated feedback enhances Large Language Models (LLMs) in robotic planning tasks, finding that feedback improves problem-solving but with effectiveness that varies across difficulty levels.
Key Points
The study evaluates four feedback strategies, including visual feedback, to enhance LLM performance in solving classical robotic planning tasks.
Results indicate that while solver-generated feedback aids LLMs in moderately difficult problems, challenges persist with harder tasks, highlighting limitations in long-term planning.
A detailed analysis of hinting strategies and LLM planning tendencies is provided, contributing to understanding LLM capabilities in higher-order reasoning tasks.
This research introduces an interpretable decision-making framework for autonomous vehicles, utilizing a Traffic Regulation Retrieval Agent and a Large Language Model to enhance compliance with traffic regulations and safety guidelines.
Key Points
The framework integrates traffic regulations and safety guidelines, allowing autonomous vehicles to adapt seamlessly to various regional rules and norms.
A Traffic Regulation Retrieval Agent employs Retrieval-Augmented Generation to automatically extract relevant traffic rules based on the vehicle's context, addressing limitations of traditional methods.
The reasoning module, powered by a Large Language Model, interprets complex rules, differentiates between mandatory and advisory guidelines, and ensures legal compliance while enhancing transparency and reliability in decision-making.
The paper presents DOTS, a novel approach for enhancing reasoning in large language models (LLMs) through dynamic reasoning trajectory search, tailored to the specific characteristics of questions and LLM capabilities.
Key Points
DOTS defines atomic reasoning action modules that can be combined into various trajectories, allowing for tailored reasoning strategies for different questions.
The method involves searching for optimal reasoning trajectories through iterative exploration, improving LLM performance on reasoning tasks compared to static techniques.
Experiments demonstrate that DOTS enables LLMs to adapt their reasoning depth based on problem complexity, enhancing their overall reasoning capabilities across multiple tasks.
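To make the trajectory-search idea concrete, here is a minimal sketch: a handful of hypothetical atomic actions are composed into candidate trajectories, and the best one is chosen by an `evaluate` callback standing in for running the LLM and scoring its answers. The action names and the exhaustive search are assumptions for illustration; DOTS searches trajectories during an offline exploration phase and then teaches the LLM to pick them, rather than enumerating them at inference time.

```python
from itertools import product

# Hypothetical atomic reasoning actions; the actual DOTS modules differ.
ANALYSIS_ACTIONS = ["none", "rewrite_question"]
SOLUTION_ACTIONS = ["chain_of_thought", "program_of_thought"]
VERIFICATION_ACTIONS = ["none", "self_verify"]

def build_prompt(question, trajectory):
    """Compose a prompt by stacking the instructions of the chosen actions."""
    instructions = {
        "rewrite_question": "First restate the problem in your own words.",
        "chain_of_thought": "Reason step by step before answering.",
        "program_of_thought": "Write and mentally execute a short program to get the answer.",
        "self_verify": "Finally, verify your answer and correct it if needed.",
        "none": "",
    }
    steps = [instructions[a] for a in trajectory if instructions[a]]
    return "\n".join([question] + steps)

def search_trajectory(question, evaluate):
    """Pick the action sequence with the highest score under `evaluate`
    (a placeholder for running the LLM and checking its answer)."""
    best, best_score = None, float("-inf")
    for trajectory in product(ANALYSIS_ACTIONS, SOLUTION_ACTIONS, VERIFICATION_ACTIONS):
        score = evaluate(build_prompt(question, trajectory))
        if score > best_score:
            best, best_score = trajectory, score
    return best

# Toy evaluator that prefers longer (more explicit) prompts, just to make this runnable.
print(search_trajectory("What is 17 * 24?", evaluate=len))
```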
GraphRouter introduces a novel graph-based framework for selecting Large Language Models (LLMs), enhancing efficiency by leveraging contextual information among tasks, queries, and LLMs to optimize performance and reduce computational costs.
Key Points
The framework constructs a heterogeneous graph with task, query, and LLM nodes, capturing contextual interactions to improve LLM selection accuracy.
An innovative edge prediction mechanism allows GraphRouter to adapt to new LLMs without retraining, achieving a minimum performance improvement of 12.3% over existing methods.
Comprehensive experiments demonstrate GraphRouter's ability to generalize across diverse tasks, providing at least a 9.5% boost in effectiveness while significantly lowering computational demands.
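A toy sketch of the routing-as-edge-prediction idea: each query and each LLM gets a feature vector, an edge score trades predicted quality against cost, and a new LLM can be added at inference time just by supplying its descriptor. The dot-product scorer and cost penalty are stand-ins for GraphRouter's learned edge predictor over the heterogeneous graph.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical node features: in GraphRouter these come from a heterogeneous graph
# over task, query, and LLM nodes; here we just use fixed descriptor vectors.
query_embedding = rng.normal(size=8)

llm_descriptors = {
    "small-cheap-llm":  rng.normal(size=8),
    "large-costly-llm": rng.normal(size=8),
}
llm_cost = {"small-cheap-llm": 1.0, "large-costly-llm": 8.0}

def edge_score(q_emb, llm_emb, cost, cost_weight=0.05):
    """Predicted utility of routing the query to this LLM:
    similarity of the (query, LLM) pair minus a cost penalty."""
    return float(q_emb @ llm_emb) - cost_weight * cost

def route(q_emb):
    scores = {name: edge_score(q_emb, emb, llm_cost[name])
              for name, emb in llm_descriptors.items()}
    return max(scores, key=scores.get), scores

# A new LLM can be added by supplying its descriptor and cost, without retraining
# the scorer -- the property the edge-prediction design targets.
llm_descriptors["new-llm"] = rng.normal(size=8)
llm_cost["new-llm"] = 3.0
print(route(query_embedding))
```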
The paper 'PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs' presents a novel quantization technique that enhances the efficiency of Large Language Models (LLMs) by addressing token-wise outliers, outperforming traditional dynamic methods.
Key Points
PrefixQuant isolates high-frequency outlier tokens offline, allowing for efficient per-tensor static quantization without the need for re-training, simplifying the quantization process.
The method demonstrates significant performance improvements, achieving a 7.43 perplexity on WikiText2 and enhancing accuracy on common-sense reasoning tasks compared to previous dynamic quantization methods.
Inference speed is notably increased, with PrefixQuant quantized models being 1.60x to 2.81x faster than FP16 models, showcasing its effectiveness in practical applications.
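The sketch below illustrates why isolating outlier tokens helps static quantization: a few synthetic outlier rows inflate the single per-tensor scale, and excluding them offline yields a much tighter scale for the remaining tokens. The norm-based outlier detection and int8 scheme are simplifications; PrefixQuant identifies specific high-frequency tokens and prefixes them in the KV cache rather than discarding them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Calibration activations: (tokens, hidden). A few tokens carry huge outliers,
# the phenomenon PrefixQuant attributes to specific high-frequency tokens.
acts = rng.normal(size=(128, 64))
acts[:3] *= 50.0  # pretend the first three tokens are the outlier tokens

def per_tensor_int8(x):
    """Symmetric per-tensor static quantization with a single precomputed scale."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -128, 127)
    return q * scale, scale

# Naive static quantization: the outlier tokens inflate the scale for everyone.
_, naive_scale = per_tensor_int8(acts)

# PrefixQuant-style idea (simplified): identify outlier tokens offline and exclude
# them from scale calibration; at inference they would be handled by prefixing
# their KV cache rather than re-quantizing per token.
token_norms = np.abs(acts).max(axis=1)
outlier_idx = np.argsort(token_norms)[-3:]          # offline outlier identification
normal = np.delete(acts, outlier_idx, axis=0)
dq, clean_scale = per_tensor_int8(normal)

print(f"scale with outliers:    {naive_scale:.4f}")
print(f"scale without outliers: {clean_scale:.4f}")
print(f"mean abs error (normal tokens): {np.abs(dq - normal).mean():.5f}")
```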
The paper 'GSM-Symbolic' investigates the limitations of mathematical reasoning in Large Language Models (LLMs) using a new benchmark to assess their performance on grade-school-level math questions.
Key Points
The study reveals that while LLMs show improved performance on the GSM8K benchmark, their mathematical reasoning capabilities remain questionable, highlighting inconsistencies in reported metrics.
The introduction of the GSM-Symbolic benchmark allows for more controlled evaluations, demonstrating that LLMs struggle with variations in numerical values and increased complexity in questions.
Findings indicate that LLMs replicate reasoning steps from training data rather than performing genuine logical reasoning, with performance dropping significantly when additional clauses are introduced, even if irrelevant to the final answer.
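A minimal sketch of the symbolic-template idea: the surface wording is fixed while names and numbers are resampled, so the ground-truth answer changes with every instance and memorized answers stop working. The template, value ranges, and toy model below are invented for illustration and are not drawn from the benchmark.

```python
import random

# A hypothetical symbolic template in the spirit of GSM-Symbolic.
TEMPLATE = ("{name} has {a} apples. She buys {b} more bags with {c} apples each. "
            "How many apples does {name} have now?")

def instantiate(seed):
    rng = random.Random(seed)
    values = {
        "name": rng.choice(["Ava", "Maya", "Lena"]),
        "a": rng.randint(2, 20),
        "b": rng.randint(2, 6),
        "c": rng.randint(3, 12),
    }
    question = TEMPLATE.format(**values)
    answer = values["a"] + values["b"] * values["c"]   # symbolic ground truth
    return question, answer

def evaluate(model_answer_fn, n=100):
    """Accuracy over template instances; a robust reasoner should be invariant
    to the resampled numbers, which is what the benchmark probes."""
    correct = sum(model_answer_fn(q) == a for q, a in map(instantiate, range(n)))
    return correct / n

# Toy "model" that always answers 42, just to make the harness runnable.
print(evaluate(lambda q: 42))
```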
This research paper explores the in-context learning capabilities of large language models (LLMs) in estimating probability density functions (PDFs), revealing unique learning trajectories distinct from traditional methods.
Key Points
The study utilizes Intensive Principal Component Analysis (InPCA) to visualize the learning dynamics of LLaMA-2 models during in-context density estimation tasks.
Findings indicate that LLaMA models follow similar low-dimensional learning trajectories, contrasting with traditional density estimation techniques like histograms and Gaussian kernel density estimation.
The research interprets LLaMA's in-context density estimation as a kernel density estimation with adaptive parameters, providing insights into the probabilistic reasoning mechanisms of LLMs.
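As a rough sketch of that adaptive-KDE interpretation, the snippet below implements a Gaussian kernel density estimate whose bandwidth shrinks as the number of in-context samples grows; the specific decay rate and bandwidth schedule are assumptions, not the parameters fitted in the paper.

```python
import numpy as np

def adaptive_kde(samples, grid, base_bandwidth=0.5, decay=0.5):
    """Gaussian KDE whose bandwidth shrinks with the number of in-context samples,
    echoing the adaptive-kernel reading of LLaMA's in-context density estimation."""
    n = len(samples)
    h = base_bandwidth * n ** (-decay)
    diffs = (grid[:, None] - np.asarray(samples)[None, :]) / h
    return np.exp(-0.5 * diffs**2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
grid = np.linspace(-4, 4, 200)
for n in (4, 16, 64):
    samples = rng.normal(size=n)
    density = adaptive_kde(samples, grid)
    # Crude check that the estimate integrates to roughly 1.
    print(f"n={n:3d}  integral≈{(density * (grid[1] - grid[0])).sum():.3f}")
```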
The paper 'TidalDecode' presents a novel algorithm for enhancing the decoding efficiency of large language models (LLMs) by utilizing position persistent sparse attention, addressing memory constraints during the decoding phase.
Key Points
TidalDecode tackles the memory bottleneck in LLMs by optimizing token selection through a combination of full and sparse attention mechanisms, improving decoding speed.
The algorithm leverages spatial coherence in token selection across Transformer layers, ensuring relevant tokens are prioritized without excessive overhead.
Evaluation results indicate that TidalDecode achieves comparable generative performance to full attention methods while significantly reducing decoding latency by up to 2.1 times across various tasks.
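A simplified sketch of position-persistent sparse attention for one decode step: a selection layer computes full attention scores to pick the top-k key positions, and subsequent layers attend only over those positions. Reusing a single query vector across layers and placing the selection layer first are simplifications; the paper's layer schedule and re-selection points differ.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def select_tokens(query, keys, k):
    """Token-selection layer: full attention scores for the current decode step,
    keeping the k highest-scoring key positions."""
    scores = keys @ query / np.sqrt(len(query))
    return np.argsort(scores)[-k:]

def sparse_attention(query, keys, values, positions):
    """Sparse layer: attend only over the positions chosen earlier; the
    position-persistent idea is that these indices are reused across layers."""
    scores = keys[positions] @ query / np.sqrt(len(query))
    return softmax(scores) @ values[positions]

rng = np.random.default_rng(0)
seq_len, d, n_layers, k = 1024, 64, 8, 32
query = rng.normal(size=d)
kv_cache = [(rng.normal(size=(seq_len, d)), rng.normal(size=(seq_len, d)))
            for _ in range(n_layers)]

# Simplified schedule (an assumption, not the paper's exact choice):
# layer 0 does full-attention token selection, the remaining layers reuse it.
positions = select_tokens(query, kv_cache[0][0], k)
outputs = [sparse_attention(query, K, V, positions) for K, V in kv_cache[1:]]
print(len(outputs), outputs[0].shape)
```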
The paper 'DEPT: Decoupled Embeddings for Pre-training Language Models' introduces a novel framework to enhance language model pre-training by decoupling embedding layers, addressing challenges posed by heterogeneous data sources.
Key Points
DEPT mitigates the 'curse of multilinguality' by allowing models to train without a shared global vocabulary, improving adaptability across languages and domains.
The framework significantly reduces the parameter count of token embeddings by up to 80% and communication costs by 675x for large-scale models.
DEPT enables robust training under diverse data conditions and supports custom optimized vocabularies for different data sources, enhancing model generalization and performance.
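The snippet below sketches the decoupling itself: each data source keeps its own vocabulary and embedding table while the transformer body (a single matrix here) is shared. The sources, vocabularies, and dimensions are made up for illustration; DEPT's training and communication scheme is considerably more involved.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 32

# Decoupled-embedding idea (simplified): each data source keeps its own vocabulary
# and embedding table, while the transformer body parameters are shared.
sources = {
    "english_web": {"vocab": {"the": 0, "cat": 1, "sat": 2}},
    "code":        {"vocab": {"def": 0, "return": 1, "(": 2, ")": 3}},
}
for src in sources.values():
    src["embed"] = rng.normal(size=(len(src["vocab"]), d_model))

shared_body = rng.normal(size=(d_model, d_model))   # stand-in for the shared transformer

def forward(source, tokens):
    table = sources[source]["embed"]
    ids = [sources[source]["vocab"][t] for t in tokens]
    hidden = table[ids]                  # source-specific embedding lookup
    return hidden @ shared_body          # shared body, identical for every source

print(forward("english_web", ["the", "cat"]).shape)
print(forward("code", ["def", "(", ")"]).shape)
```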
The paper introduces TICK, an automated evaluation protocol using LLM-generated checklists to enhance the assessment of instruction-following abilities in Large Language Models, improving reliability and interpretability.
Key Points
TICK generates tailored evaluation checklists that break down instructions into YES/NO questions, improving the accuracy of LLM judgments compared to direct scoring methods.
The implementation of STICK (Self-TICK) enhances generation quality, achieving significant performance gains on various benchmarks through self-refinement and selection strategies.
Providing LLM-generated checklists to human evaluators increases inter-annotator agreement, demonstrating the effectiveness of structured evaluations in improving LLM assessments.
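A minimal sketch of checklist-based evaluation, assuming a hypothetical `llm(prompt)` helper: one call decomposes the instruction into YES/NO questions, one call per question judges the response, and the score is the fraction of items satisfied. The prompt wording and the stubbed model are illustrative, not TICK's actual prompts.

```python
CHECKLIST_PROMPT = (
    "Break the following instruction into a list of YES/NO questions that a "
    "response must satisfy. One question per line.\n\nInstruction: {instruction}"
)
JUDGE_PROMPT = (
    "Instruction: {instruction}\nResponse: {response}\n"
    "Question: {question}\nAnswer strictly YES or NO."
)

def tick_score(instruction, response, llm):
    # Step 1: generate the tailored checklist for this instruction.
    questions = [q.strip() for q in
                 llm(CHECKLIST_PROMPT.format(instruction=instruction)).splitlines()
                 if q.strip()]
    # Step 2: judge the response against each checklist item.
    verdicts = [
        llm(JUDGE_PROMPT.format(instruction=instruction, response=response,
                                question=q)).strip().upper().startswith("YES")
        for q in questions
    ]
    # Score = fraction of checklist items satisfied.
    return sum(verdicts) / max(len(verdicts), 1), list(zip(questions, verdicts))

# Stub LLM so the sketch runs end to end; replace with a real model call.
def fake_llm(prompt):
    if "Break the following instruction" in prompt:
        return "Is the response in French?\nDoes it contain exactly three sentences?"
    return "YES" if "Question: Is the response in French?" in prompt else "NO"

print(tick_score("Write three sentences in French.", "Bonjour...", fake_llm))
```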
This paper presents a novel perspective on Chain-of-Thought (CoT) reasoning in Large Language Models, linking it to cognitive neuroscience through the Hopfieldian view, and introduces the Representation-of-Thought (RoT) framework to enhance reasoning robustness.
Key Points
The research identifies a gap in understanding the fundamental factors behind CoT's success, proposing a connection to cognitive elements like stimuli and neural populations.
A new method for localizing reasoning errors in CoT responses is developed, enhancing interpretability and control over the reasoning process.
The Representation-of-Thought (RoT) framework is introduced, leveraging low-dimensional representation spaces to improve the robustness of CoT reasoning, supported by experimental results demonstrating its effectiveness.
This paper presents a benchmark for evaluating Large Language Models (LLMs) in Business Process Management (BPM) tasks, addressing the lack of specific benchmarks and assessing model performance variations.
Key Points
The study identifies a gap in existing benchmarks for LLMs in BPM, highlighting the need for task-specific evaluations to ensure model suitability.
It systematically compares the performance of small open-source LLMs against commercial models across four BPM tasks, revealing significant performance variations.
Insights from this research guide organizations in selecting appropriate LLMs for BPM applications, enhancing their deployment strategies and effectiveness in real-world scenarios.
This research investigates the use of Large Language Models (LLMs) to enhance ontologies by identifying disjointness axioms, improving reasoning and consistency in Knowledge Graphs.
Key Points
The study demonstrates how LLMs can be prompted to identify class disjointness, enriching ontologies with minimal manual effort while maintaining logical consistency.
Validation on the DBpedia ontology shows that effective prompt engineering can lead to reliable identification of disjoint class relationships, streamlining ontology completion.
The proposed methodology considers logical relationships between disjointness and subclass statements, optimizing LLM calls and enhancing overall performance in automated ontology enhancement.
The paper presents AIME, an innovative approach to AI system optimization using multiple LLM evaluators to enhance code generation tasks. This method significantly improves error detection and success rates compared to single LLM evaluations.
Key Points
AIME addresses the limitations of single LLM evaluators in code generation by employing multiple LLMs to evaluate outputs based on different criteria, leading to better performance.
The study demonstrates that AIME achieves up to 62% higher error detection rates and 16% higher success rates on challenging datasets like LeetCodeHard and HumanEval.
The research highlights the importance of selecting the right number of evaluators and evaluation criteria, as these choices can significantly impact the overall success rate by up to 12%.
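The sketch below shows the multi-evaluator pattern with a hypothetical `llm(prompt)` call: each criterion gets its own evaluation prompt and the verdicts are aggregated, here by flagging an error whenever any evaluator answers NO. The criteria, prompts, and aggregation rule are assumptions for illustration rather than AIME's exact design.

```python
CRITERIA = {
    "correctness": "Does the code return the right result for the stated problem?",
    "edge_cases":  "Does the code handle empty or extreme inputs?",
    "efficiency":  "Is the time complexity acceptable for the given constraints?",
}

def evaluate_code(problem, code, llm):
    """Each criterion gets its own evaluator call; the verdicts are then aggregated.
    Flag an error if any evaluator answers NO (one simple aggregation choice)."""
    reports = {}
    for name, question in CRITERIA.items():
        prompt = (f"Problem:\n{problem}\n\nCandidate code:\n{code}\n\n"
                  f"{question} Answer YES or NO, then give one sentence of reasoning.")
        reports[name] = llm(prompt).strip().upper().startswith("YES")
    return all(reports.values()), reports

# Stub evaluator so the sketch runs; swap in real model calls per criterion.
def fake_llm(prompt):
    return "NO - misses the empty-list case." if "empty" in prompt else "YES"

ok, reports = evaluate_code("Return the max of a list.",
                            "def f(xs): return max(xs)", fake_llm)
print(ok, reports)
```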
This research investigates how the sequencing of images and text in multi-modal prompts affects the reasoning performance of large language models (LLMs), revealing significant implications for prompt design across various applications.
Key Points
The study shows that the order of modalities in prompts significantly impacts LLM performance, especially in simpler tasks with single images, enhancing accuracy.
In complex tasks requiring multi-step reasoning, the effect of modality sequencing diminishes, indicating that cognitive load influences model performance.
Findings emphasize the importance of aligning modality sequences with logical reasoning flows, particularly in nested reasoning tasks, to improve multi-modal prompt effectiveness.
This research introduces Guided Stream of Search (GSoS), a method that enhances language models' search and planning capabilities by integrating optimal solutions into their self-generation process, leading to improved performance on reasoning tasks.
Key Points
GSoS leverages optimal solutions as landmarks to guide language models in generating high-quality search trajectories, improving their reasoning abilities.
The method is particularly effective on the Countdown task, demonstrating significant enhancements in search and planning compared to traditional supervised fine-tuning methods.
Combining GSoS with reinforcement learning fine-tuning yields further performance improvements, showcasing its potential over previous approaches that do not utilize RL effectively.
This research presents ANADP, a novel algorithm for fine-tuning language models with differential privacy, addressing privacy concerns while optimizing model performance through adaptive noise allocation.
Key Points
ANADP allocates noise adaptively based on the importance of model parameters, improving privacy protection without sacrificing performance.
The algorithm narrows the performance gap between standard fine-tuning and traditional differential privacy methods across various datasets.
This study highlights the need for tailored approaches in differential privacy to enhance the effectiveness of language models while ensuring privacy compliance.
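As a rough sketch of adaptive noise allocation, the snippet below clips a gradient and adds Gaussian noise whose scale is inversely related to a per-parameter importance score, while keeping the average noise level fixed. The inverse-importance rule, the importance proxy, and the privacy accounting are simplifying assumptions, not ANADP's actual allocation or guarantees.

```python
import numpy as np

def adaptive_noise_scales(importance, total_budget):
    """Allocate noise so more important parameters receive less of it, while the
    average noise level matches a fixed budget (illustrative inverse-importance rule)."""
    importance = np.asarray(importance, dtype=float)
    weights = 1.0 / (importance + 1e-8)
    weights *= len(weights) / weights.sum()     # keep the mean scale equal to the budget
    return total_budget * weights

def dp_noisy_gradient(grad, importance, clip_norm=1.0, base_sigma=0.8):
    """Clip the gradient, then add Gaussian noise whose scale varies per parameter,
    in the spirit of adaptive allocation for differentially private fine-tuning."""
    grad = np.asarray(grad, dtype=float)
    grad = grad * min(1.0, clip_norm / (np.linalg.norm(grad) + 1e-12))
    sigmas = adaptive_noise_scales(importance, base_sigma * clip_norm)
    return grad + np.random.default_rng(0).normal(scale=sigmas)

# Parameters with higher importance (e.g. larger historical gradient magnitude)
# receive smaller noise; the least important one absorbs most of the budget.
grad = [0.9, -0.3, 0.05, 0.2]
importance = [2.0, 1.0, 0.1, 0.5]
print(adaptive_noise_scales(importance, total_budget=0.8))
print(dp_noisy_gradient(grad, importance))
```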
This paper introduces the Deductive and InDuctive (DID) method to enhance reasoning in Large Language Models (LLMs) by integrating deductive and inductive reasoning dynamically, improving adaptability and performance in complex tasks.
Key Points
The DID method allows LLMs to adjust reasoning pathways based on task context, mirroring human cognitive processes for better adaptability in problem-solving.
Empirical validation on datasets like AIW and MR-GSM8K shows significant improvements in solution accuracy and reasoning quality without substantial computational costs.
The research highlights the potential of DID to inform advanced LLM-driven strategies, contributing to the intersection of cognitive science and artificial intelligence in reasoning tasks.
The paper introduces LLaMA-Berry, a novel framework that enhances the mathematical reasoning capabilities of Large Language Models (LLMs) through advanced optimization techniques, demonstrating superior performance on complex mathematical benchmarks.
Key Points
LLaMA-Berry integrates Monte Carlo Tree Search with iterative Self-Refine to optimize reasoning paths, improving exploration efficiency in mathematical problem-solving.
The framework employs a Pairwise Preference Reward Model to evaluate solution paths, addressing scoring variability and enhancing decision-making in mathematical reasoning tasks.
Testing on advanced benchmarks like GPQA and AMC23 shows LLaMA-Berry outperforms existing methods, particularly in complex Olympiad-level problems, showcasing its effectiveness in enhancing LLM capabilities.
This research investigates the use of Large Language Models (LLMs) as game testers, specifically measuring game difficulty in popular strategy games like Wordle and Slay the Spire, revealing their potential in game development.
Key Points
The study proposes a game-testing framework utilizing LLM agents to assess game difficulty, demonstrating their applicability in the gaming industry.
Results indicate that while LLMs may not match human players' performance, they show a strong correlation with human-assessed difficulty when prompted effectively.
The research outlines principles for integrating LLMs into game testing, providing guidelines for developers to enhance the game development process using AI agents.
This research examines how biases in human cognition affect GPT-4o's decision-making in probabilistic scenarios, revealing a mix of human-like errors and statistically sound judgments.
Key Points
The study investigates nine cognitive biases, including loss aversion and framing effects, through 1350 experiments to analyze GPT-4o's decision-making processes.
Findings indicate that GPT-4o produces both human-like heuristic errors and statistically sound judgments, sometimes in response to identical prompts, underscoring the complexity and inconsistency of its decision-making.
The research highlights the contradictions in GPT-4o's responses, suggesting that AI can reflect human cognitive biases while also demonstrating statistical reasoning in similar contexts.
The SAC-KG framework leverages large language models to automate the construction of domain-specific knowledge graphs, significantly enhancing precision and reducing human intervention in knowledge acquisition.
Key Points
SAC-KG integrates three components: Generator, Verifier, and Pruner, to create specialized multi-level knowledge graphs from raw domain data.
The framework achieves a remarkable precision rate of 89.32%, outperforming existing methods by over 20% in knowledge graph construction tasks.
By automatically constructing knowledge graphs at a scale exceeding one million nodes, SAC-KG demonstrates the potential of LLMs in knowledge-intensive applications across various domains.
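A toy sketch of the Generator, Verifier, and Pruner loop: candidate triples are proposed for each frontier entity, filtered by a consistency check, and only a bounded set of tails is expanded further. The string-matching stages and the tiny agriculture-flavored corpus are placeholders; in SAC-KG the generator is LLM-driven and the checks are far richer.

```python
def generator(entity, corpus):
    """Propose candidate (head, relation, tail) triples for an entity."""
    return [(entity, "related_to", w) for w in corpus.get(entity, [])]

def verifier(triples, known_entities):
    """Keep triples whose tail is a known domain entity (a toy consistency check)."""
    return [t for t in triples if t[2] in known_entities]

def pruner(triples, max_children=2):
    """Decide which tails are worth expanding further, bounding graph growth."""
    return [t[2] for t in triples[:max_children]]

def build_kg(root, corpus, known_entities, depth=2):
    kg, frontier = [], [root]
    for _ in range(depth):
        next_frontier = []
        for entity in frontier:
            verified = verifier(generator(entity, corpus), known_entities)
            kg.extend(verified)
            next_frontier.extend(pruner(verified))
        frontier = next_frontier
    return kg

corpus = {"rice": ["blast_disease", "paddy_field", "weather"],
          "blast_disease": ["fungicide", "humidity"]}
known = {"blast_disease", "paddy_field", "fungicide", "humidity"}
print(build_kg("rice", corpus, known))
```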
The paper presents StateAct, a novel approach for enhancing planning and acting capabilities in Large Language Models (LLMs) through state tracking and few-shot in-context learning, achieving significant improvements in task efficiency.
Key Points
StateAct addresses long-range reasoning challenges in LLMs by utilizing few-shot in-context learning to enhance chain-of-thought with state tracking, improving task performance.
The method establishes a new state-of-the-art on Alfworld, outperforming previous few-shot methods by 14%, while matching the performance of more resource-intensive approaches.
The research demonstrates the versatility of StateAct across various LLMs, showing that chain-of-thought improves state-tracking accuracy, although enforcing a JSON structure on outputs degrades overall performance.
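The snippet below sketches what a StateAct-style prompt might look like: every step of the few-shot example and of the running episode carries an explicit State line next to the Thought and Action, and the model is asked to re-derive the state before acting. The field names and Alfworld-flavored wording are illustrative, not the paper's verbatim prompt.

```python
FEW_SHOT_EXAMPLE = """\
Task: put a clean mug on the desk.
State: location=kitchen, holding=nothing, mug=dirty
Thought: I need to find the mug first, then clean it.
Action: go to countertop 1
State: location=countertop 1, holding=nothing, mug=dirty
Thought: The mug is here; I should pick it up.
Action: take mug 1 from countertop 1
"""

def build_stateact_prompt(task, history):
    """history is a list of (state, thought, action) steps taken so far; the model
    is asked to continue with the next State/Thought/Action block."""
    lines = [FEW_SHOT_EXAMPLE, f"Task: {task}"]
    for state, thought, action in history:
        lines += [f"State: {state}", f"Thought: {thought}", f"Action: {action}"]
    lines += ["State:"]   # the model re-derives the state before choosing an action
    return "\n".join(lines)

print(build_stateact_prompt(
    "heat an egg and put it in the garbage can",
    [("location=start, holding=nothing", "Find the egg first.", "go to fridge 1")],
))
```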
This research paper evaluates the performance of lightweight large language models (LLMs) on mobile platforms, focusing on user privacy and local deployment, while analyzing metrics that impact user experience and developer needs.
Key Points
The study assesses lightweight LLMs like Gemini Nano and LLaMA 2 7B, emphasizing their ability to run locally on smartphones, enhancing user data control.
Comprehensive measurements include token throughput, latency, battery consumption, and resource utilization, providing insights into mobile LLM performance and system dynamics.
The research compares various mobile system-on-chips (SoCs) from major vendors, revealing performance differences in handling LLM workloads and offering guidance for future mobile architecture design.
The paper 'Ward: Provable RAG Dataset Inference via LLM Watermarks' addresses the challenge of unauthorized content usage in Retrieval-Augmented Generation (RAG) systems by introducing a novel dataset and a method for rigorous statistical guarantees.
Key Points
The study formalizes the problem of RAG Dataset Inference (RAG-DI) and presents a new dataset for benchmarking methods under realistic conditions, filling a significant research gap.
Ward, the proposed RAG-DI method, utilizes LLM watermarks to provide data owners with statistical assurances regarding their dataset's usage in RAG systems.
Experimental results demonstrate that Ward outperforms existing baseline methods in accuracy, query efficiency, and robustness, paving the way for future research in RAG-DI.
The paper 'No Need to Talk: Asynchronous Mixture of Language Models' presents SmallTalk LM, a novel approach for training a mixture of language models asynchronously, enhancing efficiency and performance in language tasks.
Key Points
SmallTalk LM allows each model in the mixture to specialize in different data distribution parts, minimizing the need for high-bandwidth communication during training.
A lightweight router directs sequences to specific expert models based on a short prefix, optimizing parameter usage during inference.
Experimental results show that SmallTalk LM achieves lower perplexity than dense model baselines while maintaining similar inference costs, outperforming dense models on 75% of downstream tasks.
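A toy sketch of prefix-based routing: the router scores only the first few tokens, picks one expert, and that expert alone processes the full sequence, so experts never exchange gradients or activations. The keyword-count router and stub experts are placeholders for SmallTalk LM's learned lightweight router and independently trained language models.

```python
# Hypothetical expert specializations; the real mixture learns these from data.
EXPERT_KEYWORDS = {
    "code_expert": {"def", "return", "import", "class"},
    "news_expert": {"president", "market", "election", "city"},
    "chat_expert": {"hi", "thanks", "please", "how"},
}

def route(sequence, prefix_len=8):
    """Score each expert on the first `prefix_len` tokens only, then hand the
    whole sequence to the winner; experts never need to communicate."""
    prefix = sequence.lower().split()[:prefix_len]
    scores = {name: sum(tok in kws for tok in prefix)
              for name, kws in EXPERT_KEYWORDS.items()}
    return max(scores, key=scores.get), scores

def mixture_generate(sequence, experts):
    expert_name, _ = route(sequence)
    return experts[expert_name](sequence)   # only one expert's parameters are used

experts = {name: (lambda s, n=name: f"[{n}] handled {len(s.split())} tokens")
           for name in EXPERT_KEYWORDS}
print(mixture_generate("def fibonacci ( n ) : return n if n < 2 else ...", experts))
```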
This research paper presents a novel probabilistic evaluation framework for Large Language Models (LLMs), addressing the limitations of deterministic evaluations in assessing model capabilities, particularly in unlearning and alignment contexts.
Key Points
The study critiques existing deterministic evaluations for LLMs, highlighting their failure to accurately represent the output distribution and model capabilities, especially in critical applications.
A new probabilistic evaluation framework is introduced, providing metrics with high-probability guarantees that enhance the reliability of model assessments prior to deployment.
The research reveals that deterministic evaluations can misleadingly suggest successful unlearning, while probabilistic methods show that unlearned information often remains accessible, emphasizing the need for improved evaluation techniques in LLMs.
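To make the probabilistic-evaluation idea concrete, the sketch below samples many generations, estimates the probability of an unwanted event (here, leaking supposedly unlearned content), and reports a high-probability upper bound via Hoeffding's inequality. The leak predicate, toy sampler, and choice of bound are illustrative assumptions; the paper derives its own metrics and guarantees.

```python
import math
import random

def leak_probability_upper_bound(sample_outputs, leaked, delta=0.05):
    """Estimate the probability that sampled generations contain supposedly
    unlearned content, and return a (1 - delta) upper confidence bound
    via Hoeffding's inequality."""
    n = len(sample_outputs)
    p_hat = sum(map(leaked, sample_outputs)) / n
    return p_hat, p_hat + math.sqrt(math.log(1 / delta) / (2 * n))

# Toy model: greedy decoding never leaks, but sampling occasionally does --
# exactly the failure mode a deterministic evaluation misses.
random.seed(0)
samples = ["the secret is 1234" if random.random() < 0.03 else "harmless text"
           for _ in range(2000)]

greedy_output = "harmless text"
print("deterministic check leaked:", "1234" in greedy_output)
print("probabilistic estimate, 95% upper bound:",
      leak_probability_upper_bound(samples, lambda s: "1234" in s))
```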