EVALUATION OF ADAPTATION METHODS FOR THE GPT-4O LANGUAGE MODEL IN EDUCATIONAL ENVIRONMENTS
DOI: https://doi.org/10.33042/2522-1809-2024-6-187-12-17

Keywords: GPT-4O, Fine-Tuning, Prompt Engineering, Retrieval-Augmented Generation, agents, adaptation, education

Abstract
The adaptation of large language models, such as GPT-4O, for educational settings is becoming increasingly relevant, given their potential to improve automated support in learning environments. This study evaluates five distinct adaptation methods: the standard model (without adaptation), Fine-Tuning, Prompt Engineering, Retrieval-Augmented Generation (RAG), and agent-based systems. These methods were compared to identify the most effective approach for enhancing GPT-4O’s performance in educational tasks. The analysis was conducted using key evaluation metrics: precision, accuracy, F1-Score, average tokens per response, hallucination rate, and a final comprehensive score.
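To make these evaluation metrics concrete, the following minimal Python sketch shows one way they can be computed from per-question judgments. The aggregation used for the final comprehensive score is an assumption for illustration only, since the abstract does not specify how the individual metrics were weighted.

from dataclasses import dataclass

@dataclass
class EvalCounts:
    tp: int  # responses judged correct and relevant
    fp: int  # responses judged incorrect but presented as answers
    fn: int  # questions the model failed to answer correctly
    tn: int  # answers correctly withheld or rejected

def precision(c: EvalCounts) -> float:
    return c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0

def recall(c: EvalCounts) -> float:
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0

def accuracy(c: EvalCounts) -> float:
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0

def f1_score(c: EvalCounts) -> float:
    p, r = precision(c), recall(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0

def comprehensive_score(c: EvalCounts, hallucination_rate: float,
                        weights: tuple[float, float, float] = (0.4, 0.4, 0.2)) -> float:
    # Hypothetical weighting: F1, accuracy, and (1 - hallucination rate) combined linearly;
    # the paper's actual aggregation is not given in the abstract.
    w_f1, w_acc, w_hall = weights
    return w_f1 * f1_score(c) + w_acc * accuracy(c) + w_hall * (1.0 - hallucination_rate)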
The experimental results indicate that Fine-Tuning, which involves additional training on domain-specific educational data, offers the most significant improvements in terms of accuracy and reduction of erroneous responses (hallucinations). Fine-Tuning is especially effective in educational tasks requiring high accuracy and contextual understanding.
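As a rough illustration of this workflow, the sketch below launches a fine-tuning job through the OpenAI Python SDK. The training file, model snapshot name, and hyperparameters are placeholders, not the configuration used in this study.

# Minimal Fine-Tuning sketch (OpenAI Python SDK v1.x); file name, snapshot,
# and hyperparameters are illustrative only.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Upload chat-formatted training examples (one JSON object per line with a
#    "messages" list of system/user/assistant turns).
training_file = client.files.create(
    file=open("educational_qa_train.jsonl", "rb"),
    purpose="fine-tune",
)

# 2. Launch the fine-tuning job on a GPT-4o snapshot that supports fine-tuning.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-2024-08-06",  # assumed snapshot; check the currently supported models
    hyperparameters={"n_epochs": 3},
)
print(job.id, job.status)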
Retrieval-Augmented Generation showed promising results by leveraging external data to improve accuracy and reduce hallucinations, making it suitable for tasks that require up-to-date information. Prompt Engineering provided faster response times but produced more inaccuracies, since it relies on careful query formulation alone, without retraining.
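The sketch below illustrates the Retrieval-Augmented Generation idea, with a toy keyword-overlap retriever standing in for a real vector store; its constrained system prompt also serves as a simple example of Prompt Engineering. The corpus, prompt wording, and model name are assumptions, not the pipeline evaluated in the study.

# Minimal RAG sketch: retrieve course material, then ground the answer in it.
from openai import OpenAI

client = OpenAI()

def retrieve(query: str, corpus: list[str], k: int = 3) -> list[str]:
    # Toy keyword-overlap retriever; a production system would use a vector store.
    return sorted(
        corpus,
        key=lambda doc: len(set(query.lower().split()) & set(doc.lower().split())),
        reverse=True,
    )[:k]

def answer_with_rag(question: str, corpus: list[str]) -> str:
    context = "\n\n".join(retrieve(question, corpus))
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": "Answer using only the provided course material. "
                        "If the material does not contain the answer, say so."},
            {"role": "user", "content": f"Course material:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content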
Agent-based systems excelled in handling complex tasks, though they showed a slight increase in hallucination rates due to their dynamic nature. The baseline performance of the standard GPT-4O model highlighted its limitations, such as reduced accuracy and higher hallucination rates, especially in educational contexts.
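A minimal agent-style loop is sketched below using the chat-completions tool-calling interface; it shows how such a system decides when to call an external tool before answering. The single lookup_syllabus tool is hypothetical and stands in for the richer tool sets of real agent-based systems.

# Minimal agent loop sketch; the lookup_syllabus tool is a placeholder.
import json
from openai import OpenAI

client = OpenAI()

def lookup_syllabus(topic: str) -> str:
    # Placeholder tool: a real deployment would query the course database.
    return f"Syllabus entry for {topic}: covered in week 4, see lecture notes 4.1-4.3."

TOOLS = [{
    "type": "function",
    "function": {
        "name": "lookup_syllabus",
        "description": "Look up where a topic appears in the course syllabus.",
        "parameters": {
            "type": "object",
            "properties": {"topic": {"type": "string"}},
            "required": ["topic"],
        },
    },
}]

def run_agent(question: str) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(5):  # bounded loop so the agent cannot iterate indefinitely
        response = client.chat.completions.create(model="gpt-4o", messages=messages, tools=TOOLS)
        msg = response.choices[0].message
        if not msg.tool_calls:
            return msg.content
        messages.append(msg)
        for call in msg.tool_calls:
            args = json.loads(call.function.arguments)
            messages.append({"role": "tool", "tool_call_id": call.id,
                             "content": lookup_syllabus(**args)})
    return "The agent did not reach a final answer within the step limit."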
These findings underscore that Fine-Tuning is the most effective adaptation method for educational tasks, offering substantial improvements in accuracy and reliability. Overall, this research highlights the necessity of selecting the appropriate adaptation method based on the specific requirements of educational tasks. The study contributes to the ongoing optimization of large language models for use in educational environments, ensuring that responses are both reliable and contextually relevant.
License
The authors who publish in this collection agree to the following terms:
• The authors retain the rights of authorship of their work and grant the journal the right of first publication of this work under the terms of the CC BY-NC-ND 4.0 license (Attribution-NonCommercial-NoDerivatives 4.0 International), which allows others to freely distribute the published work with mandatory attribution to the authors of the original work and to its first publication in this journal.
• The authors may enter into separate, additional agreements for the non-exclusive distribution of the work in the form in which it was published by this journal (for example, posting the work in an institutional repository or publishing it as part of a monograph), provided that the reference to the first publication of the work in this journal is preserved.
• The journal's policy permits and encourages the posting of manuscripts on the Internet (for example, in institutional repositories or on personal websites), both before the publication of the manuscript and during its editorial processing, as this contributes to productive scientific discussion and has a positive effect on the efficiency and dynamics of citation of the published work (see The Effect of Open Access).