EVALUATION OF ADAPTATION METHODS FOR THE GPT-4O LANGUAGE MODEL IN EDUCATIONAL ENVIRONMENTS

Authors

  • V. Didok, O.M. Beketov National University of Urban Economy in Kharkiv
  • M. Pan, O.M. Beketov National University of Urban Economy in Kharkiv

DOI:

https://doi.org/10.33042/2522-1809-2024-6-187-12-17

Keywords:

GPT-4o, Fine-Tuning, Prompt Engineering, Retrieval-Augmented Generation, agents, adaptation, education

Abstract

The adaptation of large language models such as GPT-4o for educational settings is becoming increasingly relevant, given their potential to improve automated support in learning environments. This study evaluates five adaptation approaches: the standard model (without adaptation), Fine-Tuning, Prompt Engineering, Retrieval-Augmented Generation (RAG), and agent-based systems. These methods were compared to identify the most effective way of enhancing GPT-4o's performance on educational tasks. The analysis used key evaluation metrics: precision, accuracy, F1-Score, average tokens per response, hallucination rate, and a final comprehensive score.
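For reference, the classification metrics listed above follow their standard definitions over true positives, false positives, true negatives, and false negatives (TP, FP, TN, FN); the hallucination rate is the share of responses containing unsupported content, and the comprehensive score aggregates the individual metrics (its exact weighting is defined in the paper and not reproduced here):

    Accuracy  = (TP + TN) / (TP + FP + TN + FN)
    Precision = TP / (TP + FP)
    Recall    = TP / (TP + FN)
    F1-Score  = 2 * Precision * Recall / (Precision + Recall)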
The experimental results indicate that Fine-Tuning, which involves additional training on domain-specific educational data, yields the largest gains in accuracy and the greatest reduction in erroneous responses (hallucinations). It is especially effective in educational tasks that require high accuracy and contextual understanding.
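As an illustration of the Fine-Tuning workflow, the sketch below launches a fine-tuning job on a GPT-4o snapshot via the OpenAI Python SDK; the file name, model snapshot, and training data are placeholders, and the paper's actual training setup may differ. Each line of the JSONL file is expected to hold one chat-formatted example ({"messages": [...]}).

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # Upload a JSONL file of chat-formatted educational examples
    training_file = client.files.create(
        file=open("edu_examples.jsonl", "rb"),  # placeholder file name
        purpose="fine-tune",
    )

    # Launch the fine-tuning job on a GPT-4o snapshot that supports fine-tuning
    job = client.fine_tuning.jobs.create(
        training_file=training_file.id,
        model="gpt-4o-2024-08-06",
    )
    print(job.id, job.status)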
Retrieval-Augmented Generation also performed well, leveraging external data to improve accuracy and lower the hallucination rate, which makes it well suited to tasks that require up-to-date information. Prompt Engineering provided faster responses but produced more inaccuracies, since it depends on careful query formulation rather than retraining.
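A minimal sketch of the RAG pattern, assuming a small in-memory corpus and OpenAI embeddings (the documents, model names, and prompt wording are illustrative, not the paper's setup): relevant passages are retrieved by cosine similarity and prepended to the prompt, which is also where the prompt-engineering step happens.

    from openai import OpenAI
    import numpy as np

    client = OpenAI()

    documents = [
        "Lecture 3: Newton's second law states that F = m * a.",
        "Course policy: assignments are due on Fridays at 23:59.",
    ]

    def embed(texts):
        # Batch-embed texts with an OpenAI embedding model
        resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
        return np.array([d.embedding for d in resp.data])

    doc_vecs = embed(documents)

    def answer(question, k=1):
        q = embed([question])[0]
        # Cosine similarity between the question and every document
        sims = doc_vecs @ q / (np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q))
        # Keep the k most similar documents as grounding context
        context = "\n".join(documents[i] for i in np.argsort(sims)[-k:])
        resp = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": "Answer using only the provided course context."},
                {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"},
            ],
        )
        return resp.choices[0].message.content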
Agent-based systems excelled at complex, multi-step tasks, though their dynamic nature led to a slight increase in hallucination rates. The baseline performance of the standard GPT-4o model highlighted its limitations, such as lower accuracy and higher hallucination rates, particularly in educational contexts.
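The agent-based approach can be sketched with the OpenAI tool-calling API: the model decides when to invoke an external tool, the tool result is fed back, and the model composes the final answer. Here lookup_grade is a hypothetical tool standing in for a real learning-management-system query; everything except the SDK calls is illustrative.

    import json
    from openai import OpenAI

    client = OpenAI()

    def lookup_grade(student_id: str) -> str:
        # Hypothetical backend call; a real agent would query the LMS here
        return json.dumps({"student_id": student_id, "grade": "B+"})

    tools = [{
        "type": "function",
        "function": {
            "name": "lookup_grade",
            "description": "Look up a student's current grade by ID.",
            "parameters": {
                "type": "object",
                "properties": {"student_id": {"type": "string"}},
                "required": ["student_id"],
            },
        },
    }]

    messages = [{"role": "user", "content": "What grade does student 42 have?"}]
    resp = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
    # Assume the model chose to call the tool for this query
    call = resp.choices[0].message.tool_calls[0]

    # Run the requested tool and return its output so the model can answer
    messages.append(resp.choices[0].message)
    messages.append({"role": "tool", "tool_call_id": call.id,
                     "content": lookup_grade(**json.loads(call.function.arguments))})
    final = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
    print(final.choices[0].message.content)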
These findings underscore that Fine-Tuning is the most effective adaptation method for educational tasks, offering substantial improvements in accuracy and reliability. Overall, this research highlights the necessity of selecting the appropriate adaptation method based on the specific requirements of educational tasks. The study contributes to the ongoing optimization of large language models for use in educational environments, ensuring that responses are both reliable and contextually relevant.

Author Biographies

V. Didok, O.M. Beketov National University of Urban Economy in Kharkiv

Second-year Master's student, Educational and Scientific Institute of Energy, Information and Transport Infrastructure

M. Pan, O.M. Beketov National University of Urban Economy in Kharkiv

Candidate of Technical Sciences, Associate Professor at the Department of Computer Science and Information Technologies


Published

2024-12-17

How to Cite

Didok, V., & Pan, M. (2024). EVALUATION OF ADAPTATION METHODS FOR THE GPT-4O LANGUAGE MODEL IN EDUCATIONAL ENVIRONMENTS. Municipal Economy of Cities, 6(187), 12–17. https://doi.org/10.33042/2522-1809-2024-6-187-12-17

Issue

Section

Articles