Human vs. Machine Feedback: Evaluating ChatGPT-4 in the Assessment of Secondary EFL Learners’ Writing
Abstract
This mixed-methods study compared ChatGPT-4-generated feedback with teacher feedback in assessing secondary school EFL learners’ writing. Conducted in a state secondary school in Istanbul, Türkiye, the study involved 53 fifth-grade students who completed weekly writing tasks over six weeks. One class received teacher feedback, while two classes received feedback from ChatGPT-4. Students revised their texts based on the feedback, and changes in writing performance were analysed quantitatively. In addition, semi-structured interviews with 45 students were conducted to explore their feedback preferences and perceptions. The quantitative findings showed that both feedback types supported improvement in students’ writing, although teacher feedback produced more consistent gains across the six-week period. A strong positive correlation was found between ChatGPT-4 and teacher scores in both pre- and post-feedback assessments, suggesting that AI-generated scoring aligned closely with human evaluation. However, the qualitative findings revealed that students generally preferred teacher feedback, describing it as more personal, motivating, and easier to trust. ChatGPT-4 feedback was appreciated for its speed, clarity, and accessibility, but was also seen as less detailed and less emotionally engaging. The findings suggest that generative AI can serve as a useful formative feedback tool in EFL writing instruction, but it does not replace the pedagogical and relational strengths of teacher feedback. A hybrid model that combines the efficiency of AI with the contextual sensitivity of human feedback may offer the most effective approach for supporting writing development in secondary language classrooms.
Article Details

This work is licensed under a Creative Commons Attribution 4.0 International License.