Quickly access the latest research papers on large language models from arXiv.
Papers and research related to Large Language Models
The BALROG benchmark evaluates the reasoning capabilities of Large Language Models (LLMs) and Vision Language Models (VLMs) in complex games, revealing significant performance gaps in challenging tasks and vision-based decision-making.
The AdaptAgent framework enhances multimodal web agents' adaptability to new websites using few-shot learning from human demonstrations, significantly improving task success rates in various benchmarks.
This research evaluates the deployment of a physician-supervised LLM-based conversational agent, Mo, in a medical advice chat service, demonstrating improved patient experience and safety in healthcare communication.
SpecTool is introduced as a benchmark for identifying error patterns in Large Language Models (LLMs) during tool-use tasks, analyzing LLM outputs to surface insights for improving system performance.
The study evaluates Sporo AraSum, an Arabic language model designed for clinical documentation, demonstrating its superiority over JAIS in accuracy and cultural sensitivity for healthcare applications.
This research investigates the use of large language models (LLMs) to create synthetic datasets for evaluating product desirability, demonstrating effective sentiment alignment and cost-efficient data generation methods.
The paper presents LIMBA, an open-source framework aimed at preserving low-resource languages through generative models, using Sardinian as a case study to demonstrate its potential in language revitalization.
The paper presents SynEHRgy, a method for generating synthetic Electronic Health Records (EHRs) using a novel tokenization strategy and a decoder-only transformer model, enhancing data utility and privacy.
This paper explores the unification of Balti dialects through the application of Large Language Models (LLMs) and AI technology, emphasizing the importance of standardizing this endangered language amidst cultural diversity.
This research paper investigates the memorization of bug benchmarks by Large Language Models (LLMs), revealing concerns about data leakage affecting the evaluation of model performance in software engineering tasks.
This paper presents a framework called spec2code that integrates Large Language Models (LLMs) with formal verification to automate the generation of embedded automotive software, demonstrating feasibility through industrial case studies.
FASTNav introduces a method for enhancing small language models (SLMs) to improve robot navigation, leveraging edge computing for efficient local deployment and interaction.
This paper explores how large language models (LLMs) engage in existential conversations, discussing topics like consciousness and the role of AI in society, while examining cultural influences and community impacts.
This research paper investigates the information security awareness (ISA) of large language models (LLMs), revealing significant variations in their ability to provide safe responses based on user prompts and system settings.
This research paper explores how Large Language Models (LLMs) can generate content that maximizes user engagement on social networks, utilizing reinforcement learning and an engagement model based on opinion dynamics.
This survey paper explores how human and large language model (LLM) feedback can enhance reinforcement learning (RL) in complex environments, addressing challenges like sample inefficiency and the curse of dimensionality.
The paper presents ALIGN, a compositional LLM-based system designed for automated medical coding, addressing interoperability issues in historical clinical trial data to enhance research and drug development.
The paper 'LLMSteer' presents a novel framework aimed at improving long-context inference in large language models (LLMs) by implementing query-independent attention steering, significantly enhancing efficiency and performance.
This research paper evaluates the capabilities of large language models (LLMs) in understanding social dynamics, particularly in the context of social media interactions involving bullying and anti-bullying messages.
MERLOT is a novel mixture-of-experts framework utilizing distilled large language models for efficient encrypted traffic classification, achieving high accuracy while minimizing computational costs.
The paper presents LEDRO, a novel framework that leverages Large Language Models (LLMs) for optimizing analog circuit design, significantly improving efficiency and generalizability compared to traditional methods.
The paper 'Selective Attention: Enhancing Transformer through Principled Context Control' introduces Selective Self-Attention (SSA), a method that improves the attention mechanism in transformers by controlling contextual sparsity and relevance through temperature scaling.
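For readers unfamiliar with temperature scaling in attention, the minimal sketch below shows the general mechanism: a learned per-query temperature divides the attention logits before the softmax, so small temperatures sharpen (sparsify) the context a token attends to and large ones flatten it. The `tau_proj` head and its softplus parameterization are illustrative assumptions, not necessarily the paper's exact design.

```python
# Minimal sketch of temperature-scaled self-attention (illustrative, not SSA's exact design).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TemperatureScaledSelfAttention(nn.Module):
    def __init__(self, d_model: int, n_heads: int):
        super().__init__()
        self.n_heads = n_heads
        self.d_head = d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)
        self.out = nn.Linear(d_model, d_model)
        # Per-query scalar temperature per head; softplus keeps it positive.
        self.tau_proj = nn.Linear(d_model, n_heads)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, t, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        q = q.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        k = k.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        v = v.view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        # tau < 1 sharpens (sparser) attention; tau > 1 flattens it.
        tau = F.softplus(self.tau_proj(x)).transpose(1, 2).unsqueeze(-1)  # (b, h, t, 1)
        logits = (q @ k.transpose(-2, -1)) / (self.d_head ** 0.5)
        attn = torch.softmax(logits / tau, dim=-1)
        return self.out((attn @ v).transpose(1, 2).reshape(b, t, d))

x = torch.randn(2, 16, 64)
print(TemperatureScaledSelfAttention(64, 4)(x).shape)  # torch.Size([2, 16, 64])
```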
The paper 'Reward Modeling with Ordinal Feedback: Wisdom of the Crowd' presents a framework for learning reward models from ordinal feedback, enhancing the alignment of large language models by capturing more nuanced human preferences.
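A hedged sketch of the underlying idea: instead of the binary Bradley-Terry target, each graded label (e.g., "slightly better", "tie") becomes a soft target for the preference probability sigmoid(r_A - r_B). The label-to-probability mapping below is an illustrative assumption, not the paper's calibration.

```python
# Reward modeling from ordinal (graded) comparisons: soft cross-entropy against
# the Bradley-Terry probability. Illustrative sketch only.
import torch
import torch.nn.functional as F

# Illustrative soft targets for ordinal labels (assumed, not from the paper).
ORDINAL_TARGETS = {"A_much_better": 0.95, "A_slightly_better": 0.7,
                   "tie": 0.5, "B_slightly_better": 0.3, "B_much_better": 0.05}

def ordinal_reward_loss(reward_a: torch.Tensor, reward_b: torch.Tensor,
                        labels: list[str]) -> torch.Tensor:
    """Soft cross-entropy between sigmoid(r_A - r_B) and ordinal targets."""
    target = torch.tensor([ORDINAL_TARGETS[l] for l in labels])
    # Reduces to the standard Bradley-Terry preference loss when target is 0 or 1.
    return F.binary_cross_entropy_with_logits(reward_a - reward_b, target)

r_a = torch.tensor([1.2, 0.1, -0.3])
r_b = torch.tensor([0.4, 0.0, 0.8])
print(ordinal_reward_loss(r_a, r_b, ["A_much_better", "tie", "B_slightly_better"]))
```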
This research presents an innovative on-board Vision-Language Model (VLM) framework for personalized motion control in autonomous vehicles, enhancing user experience through adaptive driving behavior based on individual preferences.
The paper presents ACING, a novel approach for optimizing prompts in black-box Large Language Models (LLMs) using reinforcement learning, significantly improving instruction effectiveness across various tasks.
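To make the black-box setting concrete, the sketch below frames instruction search as a simple epsilon-greedy bandit over candidate prompts. ACING itself uses a more sophisticated reinforcement-learning formulation, so this is only a simplified stand-in, and `evaluate_prompt` is a hypothetical placeholder for scoring a prompt against a validation set via the target LLM.

```python
# Black-box prompt optimization as a multi-armed bandit (simplified stand-in for ACING).
import random

def evaluate_prompt(prompt: str) -> float:
    """Hypothetical stand-in for a black-box LLM evaluation (e.g., task accuracy)."""
    return len(set(prompt.split())) / 10 + random.gauss(0, 0.05)

candidates = ["Answer concisely.",
              "Think step by step, then answer.",
              "You are an expert; explain your reasoning briefly."]
counts = [0] * len(candidates)
values = [0.0] * len(candidates)

for step in range(200):
    # Explore with probability 0.1, otherwise exploit the best estimate so far.
    i = random.randrange(len(candidates)) if random.random() < 0.1 \
        else max(range(len(candidates)), key=lambda j: values[j])
    reward = evaluate_prompt(candidates[i])
    counts[i] += 1
    values[i] += (reward - values[i]) / counts[i]  # running-mean update

print("best instruction:", candidates[max(range(len(candidates)), key=lambda j: values[j])])
```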
The paper presents CATCH, a novel approach to mitigate hallucinations in Large Vision-Language Models (LVLMs) by enhancing visual processing and reducing misalignment issues, crucial for applications in sensitive fields like healthcare.
This research investigates the effectiveness of advanced Large Language Models (LLMs) in multi-class disease classification, focusing on four specific diseases and comparing models like BioBERT, XLNet, and Last-BERT.
This research paper investigates backdoor attacks in Large Language Models (LLMs) by analyzing model-generated explanations, revealing how these attacks manipulate model behavior and offering insights for enhancing LLM security.
This research presents a novel framework for real-time translation between American Sign Language (ASL) and Indian Sign Language (ISL) using Large Language Models (LLMs), enhancing accessibility for sign language users.
This paper introduces the Visual Inference Chain (VIC) framework to enhance multimodal large language models (MLLMs) by reducing visual hallucinations and improving reasoning accuracy in visual-language tasks.
The paper presents TRIDENT, a novel framework utilizing Multimodal Large Language Model (MLLM) embeddings and attribute smoothing for compositional zero-shot learning (CZSL), addressing limitations in existing models and achieving state-of-the-art performance.
This research paper investigates the application of Large Language Models (LLMs) in combinatorial optimization, specifically for sequencing the Design Structure Matrix (DSM) in engineering contexts, demonstrating improved performance through contextual knowledge integration.
This research paper introduces Additional Logic Training (ALT) to enhance the reasoning capabilities of large language models (LLMs) through a synthetic logic corpus, demonstrating significant improvements in logical reasoning and related benchmarks.
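As a rough illustration of what a synthetic logic corpus can look like, the snippet below generates modus-ponens chains over nonsense predicates and renders them as premise/hypothesis training pairs. The templates and vocabulary are invented for this sketch and are not ALT's actual corpus design.

```python
# Hedged sketch: programmatically generated deduction instances for logic training.
import random

NOUNS = ["wug", "blick", "dax", "fep", "toma"]
PREDS = ["is zorple", "can flim", "is quandry", "has glimmer"]

def make_example(chain_len: int = 2) -> dict:
    """Sample a modus-ponens chain and render it as a textual entailment pair."""
    subject = random.choice(NOUNS)
    facts = random.sample(PREDS, chain_len + 1)
    premises = [f"Every {subject} {facts[0]}."]
    for a, b in zip(facts, facts[1:]):
        premises.append(f"Anything that {a} {b}.")
    return {"premises": " ".join(premises),
            "hypothesis": f"Every {subject} {facts[-1]}.",
            "label": "entailed"}

for _ in range(3):
    print(make_example())
```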
This paper introduces a benchmark for evaluating the prompt steerability of Large Language Models (LLMs), assessing their ability to reflect diverse personas and value systems through tailored prompting techniques.
This research paper investigates how ambiguity in natural language affects Large Language Models (LLMs) in open-domain question answering, proposing token-level disambiguation strategies to enhance performance.
This paper presents a layered architecture for developing Large Language Model (LLM)-based software systems, addressing the need for enhanced capabilities beyond basic tasks through systematic implementation and technology selection.
This scoping review analyzes generative AI models for creating synthetic health records, focusing on medical text, time series, and longitudinal data, highlighting key research objectives and methodologies.
This paper explores the retrieval problem, a reasoning task solvable by transformers with a minimum number of layers, revealing insights into attention mechanisms and the emergence of attention heads during training.
The paper 'LLM4DS' evaluates the performance of leading Large Language Models (LLMs) in generating data science code, revealing their potential and limitations across various coding challenges.
This research investigates the influence of moral persuasion on Large Language Models (LLMs), assessing their susceptibility to ethical alignment through two experiments involving morally ambiguous scenarios and predefined ethical frameworks.
FedCoLLM introduces a parameter-efficient federated co-tuning framework that enhances both Large Language Models (LLMs) and Small Language Models (SLMs) by facilitating knowledge transfer while maintaining data privacy.
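One plausible ingredient of such LLM-to-SLM co-tuning is mutual distillation on shared outputs; the hedged sketch below implements a symmetric KL objective between the two models' next-token distributions. The temperature, and the omission of FedCoLLM's federated aggregation and privacy machinery, are simplifying assumptions.

```python
# Mutual (bidirectional) distillation between an LLM and an SLM on shared logits.
# Illustrative sketch only; not FedCoLLM's exact objective.
import torch
import torch.nn.functional as F

def mutual_distillation_loss(llm_logits, slm_logits, temperature=2.0):
    """Symmetric KL between the two models' softened next-token distributions."""
    p_llm = F.log_softmax(llm_logits / temperature, dim=-1)
    p_slm = F.log_softmax(slm_logits / temperature, dim=-1)
    kl_ls = F.kl_div(p_slm, p_llm, log_target=True, reduction="batchmean")
    kl_sl = F.kl_div(p_llm, p_slm, log_target=True, reduction="batchmean")
    return (kl_ls + kl_sl) * temperature ** 2  # rescale gradients as in standard KD

llm_logits = torch.randn(4, 32000)  # batch of next-token logits over a shared vocab
slm_logits = torch.randn(4, 32000)
print(mutual_distillation_loss(llm_logits, slm_logits))
```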
This technical report explores enhancing the reasoning capabilities of Large Language Models (LLMs) using reward-guided tree search algorithms, focusing on mathematical reasoning tasks across four challenging datasets.
The paper introduces the Extended Influence Function for Contrastive Loss (ECIF), addressing misalignment issues in multimodal large language models (MLLMs) caused by unreliable data sources, enhancing model interpretability and robustness.
This paper presents a topology-aware preemptive scheduling method for co-located large language model (LLM) workloads, enhancing resource allocation efficiency and improving performance by 55%.
This paper presents a novel approach to mitigate hallucinations in Large Language Models (LLMs) by integrating Knowledge Graphs (KGs) as an additional modality, enhancing factual accuracy without external retrieval.
This research paper explores the application of large language models (LLMs) to predictive tasks in relational databases, demonstrating their competitive performance despite the complexity of relational data structures.
The paper presents LLM-IE, a Python package designed for generative information extraction using large language models, addressing challenges in prompt engineering and providing a complete extraction pipeline for biomedical applications.
The paper 'BitMoD: Bit-serial Mixture-of-Datatype LLM Acceleration' presents an innovative algorithm-hardware co-design solution to enhance the efficiency of large language models (LLMs) by reducing their memory footprint while maintaining high accuracy.
The paper 'The Dark Side of Trust' explores vulnerabilities in large language models (LLMs) due to their bias toward authority, which can be exploited for jailbreak attacks, and proposes a defense strategy to mitigate these risks.
The paper presents a zero-shot load forecasting approach built on the pre-trained Chronos time-series language model, achieving accurate predictions in data-scarce scenarios without extensive training.
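For context, Chronos checkpoints are publicly available, so zero-shot forecasting of this kind can be tried directly. The usage sketch below assumes the open-source chronos-forecasting package, with an illustrative checkpoint and horizon rather than the paper's exact evaluation setup.

```python
# Zero-shot forecasting with a pre-trained Chronos checkpoint.
# Assumes: pip install chronos-forecasting
import torch
from chronos import ChronosPipeline

pipeline = ChronosPipeline.from_pretrained("amazon/chronos-t5-small")

# Hourly load history as a 1-D tensor; no fine-tuning, purely zero-shot.
history = torch.tensor([410.0, 395.0, 380.0, 402.0, 450.0, 498.0, 520.0, 515.0])
samples = pipeline.predict(history, prediction_length=4)  # (num_series, num_samples, 4)
forecast = samples.median(dim=1).values                   # point forecast per step
print(forecast)
```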
The paper 'Steering Language Model Refusal with Sparse Autoencoders' investigates methods to enhance language model safety by steering model activations at inference time, rather than updating weights, to improve refusal behavior against unsafe prompts.
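Activation steering of this kind is straightforward to prototype; the sketch below adds a scaled "refusal" direction to one layer's residual stream with a PyTorch forward hook. In the paper the direction comes from a sparse autoencoder's decoder weights, whereas here `refusal_direction`, the layer index, and the scale are stand-in assumptions.

```python
# Inference-time activation steering via a forward hook (illustrative sketch).
import torch

d_model = 768
refusal_direction = torch.randn(d_model)   # stand-in for an SAE decoder row
refusal_direction /= refusal_direction.norm()
scale = 4.0                                # steering strength (assumed)

def steering_hook(module, inputs, output):
    # Transformer blocks often return tuples whose first element is the hidden state.
    hidden = output[0] if isinstance(output, tuple) else output
    hidden = hidden + scale * refusal_direction.to(hidden.dtype)
    return (hidden, *output[1:]) if isinstance(output, tuple) else hidden

# Usage (assuming a Hugging Face GPT-2-style model; layer 6 chosen arbitrarily):
#   from transformers import AutoModelForCausalLM
#   model = AutoModelForCausalLM.from_pretrained("gpt2")
#   handle = model.transformer.h[6].register_forward_hook(steering_hook)
#   ... generate as usual; call handle.remove() to stop steering ...
```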