Human vs. Machine Feedback: Evaluating ChatGPT-4 in the Assessment of Secondary EFL Learners’ Writing

Yasemin Gok Acan; Aysegul Liman Kaban

doi:10.21315/apjee2026.41.1.8

Published: Jun 9, 2026

Keywords:

AI-generated feedback, ChatGPT-4, EFL writing instruction, formative assessment, human–machine collaboration

Yasemin Gok Acan

Computer Education and Instructional Technologies Department, Bahcesehir University, 34349 Istanbul, Türkiye

https://orcid.org/0009-0009-9939-5738

Aysegul Liman Kaban

STEM Education Department, Mary Immaculate College, V94 VN26 Limerick, Ireland

https://orcid.org/0000-0003-3813-2888

Abstract

This mixed-methods study compared ChatGPT-4-generated feedback with teacher feedback in assessing secondary school EFL learners’ writing. Conducted in a state secondary school in Istanbul, Türkiye, the study involved 53 fifth-grade students who completed weekly writing tasks over six weeks. One class received teacher feedback, while two classes received feedback from ChatGPT-4. Students revised their texts based on the feedback, and changes in writing performance were analysed quantitatively. In addition, semi-structured interviews with 45 students were conducted to explore their feedback preferences and perceptions. The quantitative findings showed that both feedback types supported improvement in students’ writing, although teacher feedback produced more consistent gains across the six-week period. A strong positive correlation was found between ChatGPT-4 and teacher scores in both pre- and post-feedback assessments, suggesting that AI-generated scoring aligned closely with human evaluation. However, the qualitative findings revealed that students generally preferred teacher feedback, describing it as more personal, motivating, and easier to trust. ChatGPT-4 feedback was appreciated for its speed, clarity, and accessibility, but was also seen as less detailed and less emotionally engaging. The findings suggest that generative AI can serve as a useful formative feedback tool in EFL writing instruction, but it does not replace the pedagogical and relational strengths of teacher feedback. A hybrid model that combines the efficiency of AI with the contextual sensitivity of human feedback may offer the most effective approach for supporting writing development in secondary language classrooms.

Issue

Vol. 41 No. 1 (2026)

Section

Articles

This work is licensed under a Creative Commons Attribution 4.0 International License.
