"

Policy Brief: Rethinking AI Detection Tools in Higher Education

Introduction

The rapid growth of generative artificial intelligence (AI) tools creates significant challenges for higher education institutions attempting to teach and maintain academic integrity. Some institutions have adopted AI detection tools to identify AI-generated student work, yet research indicates these tools can be unreliable, inequitable, and legally problematic. This policy brief summarizes that evidence and offers recommendations for upholding academic integrity while preserving fairness, transparency, and trust in faculty-student relationships.

How AI Detectors Work

AI detection tools analyze text for linguistic and statistical patterns associated with machine-generated content. Most rely on two measures, perplexity and burstiness, to differentiate human-written from AI-generated text. Perplexity measures how predictable a sequence of words is; lower perplexity suggests AI-generated text, which tends to follow highly regular patterns, while human writing is more varied and unpredictable (Chaka, 2023). Burstiness refers to variation across sentences: human writing mixes longer and shorter sentences, whereas AI-generated text tends to be more uniform, making it easier to flag (Chaka, 2023). However, as AI models improve, they increasingly mimic human-like variation, reducing the effectiveness of these detection techniques (Abd-Elaal et al., 2022).
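
To make these two signals concrete, the sketch below illustrates how they might be computed. It is a simplified illustration, not the method used by any particular commercial detector: perplexity is estimated with the openly available GPT-2 model through the Hugging Face transformers library, and burstiness is approximated as the standard deviation of sentence lengths. Real detectors combine many more features and apply proprietary thresholds.

import math
import re
import statistics

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

def perplexity(text: str) -> float:
    # Lower perplexity means the text is more predictable to the language
    # model, which some detectors treat as a hint of machine generation.
    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(enc["input_ids"], labels=enc["input_ids"])
    return math.exp(out.loss.item())  # average cross-entropy -> perplexity

def burstiness(text: str) -> float:
    # Standard deviation of sentence lengths (in words): human writing
    # tends to mix long and short sentences more than AI-generated text.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    return statistics.pstdev(lengths) if len(lengths) > 1 else 0.0

sample = ("The committee met on Tuesday. After a long and surprisingly "
          "heated debate about budget priorities, it adjourned. Nothing "
          "was decided.")
print(f"perplexity: {perplexity(sample):.1f}")
print(f"burstiness (sentence-length std dev): {burstiness(sample):.2f}")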

The Challenges of AI Detection Tools

Reliability and Accuracy Issues

AI detection tools frequently fail to differentiate reliably between AI-generated and human-written text. Detection models produce probability scores rather than definitive classifications, and they are susceptible to adversarial techniques such as paraphrasing and prompt engineering (Baron, 2024; Chaka, 2024; Furze, 2024; Walters, 2023). Research by Sadasivan et al. (2024) and Elkhatat et al. (2023) shows that even modest modifications to a text can significantly decrease detection accuracy. In addition, independent testing of AI detectors has found inconsistent results across platforms, raising further concerns about their efficacy (Weber-Wulff et al., 2023; Wu et al., 2024).

Recent studies also indicate that widely used AI detectors have false-positive rates ranging from 10% to 35%, depending on the dataset used for evaluation (Ibrahim et al., 2023; Perkins, 2023). At a 10% false-positive rate, for example, roughly 10 of every 100 genuinely human-written submissions would be incorrectly flagged. This variability underscores why such tools should not be the sole basis for findings of academic misconduct (Nguyen, 2023).

Evolving Nature of AI and Detection Challenges

As AI models become more advanced, detection tools struggle to keep pace. AI-generated content is increasingly difficult to identify, leading to potential false accusations and a growing sense of distrust within academic communities (Nolan, 2024; Elkhatat et al., 2023; Perkins, 2023). For institutions that choose to use AI detection tools, this reinforces the need for continuous evaluation and adaptation of detection methodologies to ensure accuracy and reliability.

Student Perspectives on AI Usage

Understanding student attitudes toward AI is crucial. Surveys indicate that a significant number of students use AI tools on assessments without perceiving it as cheating (Johnston et al., 2024; Nguyen, 2023). This highlights the need for clear guidelines and educational initiatives that define ethical AI use in academia; institutions should prioritize AI literacy programs so that students understand what responsible AI use in academic work looks like.

Bias Against Non-Native English Writers

A growing body of research documents bias in AI detection tools, particularly against non-native English speakers (Perkins et al., 2024; Walters, 2023). Detection models rely heavily on linguistic patterns, penalizing non-standard syntax and vocabulary and producing disproportionately high false-positive rates for these writers (Perkins et al., 2024; Liang et al., 2023; Otterbacher, 2023). These biases call into question the fairness of AI-based plagiarism enforcement (Perkins et al., 2024; MIT Sloan, 2024).

Supporting this claim, controlled experiments show that non-native English texts are flagged as AI-generated at twice the rate of native English texts, even when the content has been manually verified as human-written (McDonald et al., 2025; Casal & Kessler, 2023). This issue disproportionately affects international students, multilingual learners, and students with learning disabilities (Nguyen, 2023).

Legal and Privacy Risks

Many detection tools require submitting student work to external databases, potentially violating FERPA regulations and university privacy policies (McDonald et al., 2025). The same analysis finds that institutional policies around AI use remain fragmented, underscoring the need for clearer guidelines on student data protection (McDonald et al., 2025). These practices raise significant concerns about data privacy, intellectual property rights, and institutional liability (Johnston et al., 2024).

Negative Impact on Trust and Learning

False accusations of academic dishonesty can severely harm faculty-student relationships and negatively impact student learning (Eaton, 2024; Weber-Wulff et al., 2023). Luo (2024) emphasizes the importance of trust in the learning environment, showing that excessive reliance on AI detection fosters a culture of suspicion rather than support. Furthermore, an overemphasis on AI policing distracts from more effective approaches to fostering academic integrity, such as assignment redesign and ethical AI literacy (Johnston et al., 2024; Furze, 2024; Fleckenstein et al., 2024; Perkins, 2023). Together, these issues underscore the need for human oversight in academic integrity investigations and for treating detection tools as one source of evidence among several (Coccoli & Patanè, 2024; Ward, 2024).

Policy Recommendations

To effectively address the challenges of AI detection tools in higher education, institutions should implement the following recommendations:

  1. Do not use AI detection tools as sole evidence in academic misconduct cases.
    • AI detection tools have demonstrated inconsistent accuracy, leading to a high rate of false positives. Institutions should ensure that AI detection results are only one component of a broader academic integrity investigation.
  2. Require human review and multiple forms of evidence in all academic integrity investigations.
    • Given the risks of false accusations, AI-generated flags should always be supplemented with manual faculty review and alternative verification methods, such as reviewing writing history, style comparisons, and student interviews.
    • Establish clear protocols for handling suspected AI-generated content.
  3. Develop AI literacy programs to educate students and faculty on responsible AI use.
    • Literacy programs should cover the basics of how generative AI functions, awareness of which programs use AI, ethical considerations for using AI, and the impact of AI on academic integrity.
    • Institutions may establish clear guidelines on the ethical and acceptable use of AI in coursework. This includes training faculty and students on how AI works, its limitations, and responsible usage to avoid reliance on ineffective detection tools.
  4. Encourage faculty to establish clear guidelines for acceptable AI use.
    • Use syllabus statements and assignment directions to specify when and how AI use is or is not acceptable. These guidelines may need to state which AI tools are acceptable, how they may and may not be used, and what the expectations are for disclosing use.
    • Coordinate with institutional student support services when crafting guidelines and/or policy statements (e.g., the Dean of Students, academic integrity teams).
  5. Implement clear data privacy policies ensuring student work is not stored in external AI detection databases.
    • Many AI detection tools require storing student work externally, raising legal and ethical concerns under regulations like FERPA. Universities should review detection tool policies to ensure compliance with privacy laws and institutional best practices.
  6. Encourage alternative assessment methods that reduce AI-assisted plagiarism risks.
    • Instead of relying solely on AI detection, universities should design assignments that emphasize critical thinking, originality, and the process of learning, which naturally deters misuse of AI. This is not about eliminating written assessment, but about diversifying forms of assessment.
  7. Conduct regular audits of AI detection tool effectiveness and transparency.
    • AI models evolve rapidly, and detection software may become obsolete or inaccurate over time. Universities should evaluate the accuracy and fairness of detection tools before adopting them and continue to audit them afterward, making adjustments as needed.

By implementing these policies, institutions can balance academic integrity with fairness, privacy, and trust, ensuring that AI detection tools are used ethically and effectively without compromising students’ rights or learning experiences.

Conclusion

AI detection tools present significant challenges in higher education, from reliability concerns to ethical and legal risks. Instead of prioritizing detection-based enforcement, institutions should focus on developing robust pedagogical strategies that promote integrity, transparency, and trust. Thoughtful policy development will ensure a balanced approach to AI’s role in academic settings, fostering environments where students and educators can engage with technology responsibly and ethically.

AI Disclosure

ChatGPT 4.0 and Claude 3.5 Sonnet were used to improve the readability of this document. No AI was used to create original material for this document.

References

Abd-Elaal, E., Gamage, S. H. P. W., & Mills, J. E. (2022). Assisting academics to identify computer generated writing. European Journal of Engineering Education, 47(5), 725–745. https://doi.org/10.1080/03043797.2022.2046709

Baron, P. (2024). Are AI detection and plagiarism similarity scores worthwhile in the age of ChatGPT and other Generative AI? Scholarship of Teaching and Learning in the South, 8(2), 151–179. https://doi.org/10.36615/sotls.v8i2.411

Casal, J. E., & Kessler, M. (2023). Can linguists distinguish between ChatGPT/AI and human writing?: A study of research ethics and academic publishing. Research Methods in Applied Linguistics, 2(3), 100068. https://doi.org/10.1016/j.rmal.2023.100068

Chaka, C. (2023). Detecting AI content in responses generated by ChatGPT, YouChat, and Chatsonic: The case of five AI content detection tools. Journal of Applied Learning & Teaching, 6(2). https://doi.org/10.37074/jalt.2023.6.2.12

Chaka, C. (2024). Reviewing the performance of AI detection tools in differentiating between AI-generated and human-written texts: A literature and integrative hybrid review. Journal of Applied Learning and Teaching, 7(1), Article 1. https://doi.org/10.37074/jalt.2024.7.1.14

Coccoli, M., & Patanè, G. (2024). AI vs. AI: The Detection Game. 2024 IEEE 8th Forum on Research and Technologies for Society and Industry Innovation (RTSI), 1–6. https://doi.org/10.1109/RTSI61910.2024.10761124

Eaton, L. (2024, June 17). AI plagiarism considerations: Part 1. AI Education Simplified. https://aiedusimplified.substack.com/p/ai-plagiarism-considerations-part

Elkhatat, A. M., Elsaid, K., & Almeer, S. (2023). Evaluating the efficacy of AI content detection tools in differentiating between human and AI-generated text. International Journal for Educational Integrity, 19(1), Article 1. https://doi.org/10.1007/s40979-023-00140-5

Fleckenstein, J., Meyer, J., Jansen, T., Keller, S. D., Köller, O., & Möller, J. (2024). Do teachers spot AI? Evaluating the detectability of AI-generated texts among student essays. Computers and Education: Artificial Intelligence, 6. https://doi.org/10.1016/j.caeai.2024.100209

Furze, L. (2024, April 9). AI detection in education is a dead end. https://leonfurze.com/2024/04/09/ai-detection-in-education-is-a-dead-end/

Ibrahim, H., Liu, F., Asim, R., Battu, B., Benabderrahmane, S., Alhafni, B., Adnan, W., Alhanai, T., AlShebli, B., Baghdadi, R., Bélanger, J. J., Beretta, E., Celik, K., Chaqfeh, M., Daqaq, M. F., Bernoussi, Z. E., Fougnie, D., Garcia de Soto, B., Gandolfi, A., & Gyorgy, A. (2023). Perception, performance, and detectability of conversational artificial intelligence across 32 university courses. Scientific Reports, 13(1), 1–13. https://doi.org/10.1038/s41598-023-38964-3

Johnston, H., Wells, R. F., Shanks, E. M., Boey, T., & Parsons, B. N. (2024). Student perspectives on the use of generative artificial intelligence technologies in higher education. International Journal for Educational Integrity, 20(1), 2. https://doi.org/10.1007/s40979-024-00149-4

Liang, W., Yuksekgonul, M., Mao, Y., Wu, E., & Zou, J. (2023). GPT detectors are biased against non-native English writers (No. arXiv:2304.02819). arXiv. http://arxiv.org/abs/2304.02819

Luo (Jess), J. (2024). How does GenAI affect trust in teacher-student relationships? Insights from students’ assessment experiences. Teaching in Higher Education, 1–16. https://doi.org/10.1080/13562517.2024.2341005

McDonald, N., Johri, A., Ali, A., & Collier, A. H. (2025). Generative artificial intelligence in higher education: Evidence from an analysis of institutional policies and guidelines. Computers in Human Behavior: Artificial Humans, 3, 100121. https://doi.org/10.1016/j.chbah.2025.100121

MIT Sloan Teaching & Learning Technologies. (n.d.). AI detectors don’t work. Here’s what to do instead. Massachusetts Institute of Technology. https://mitsloanedtech.mit.edu/ai/teach/ai-detectors-dont-work/

Nguyen, Q. H. (2023). AI and Plagiarism: Opinion from Teachers, Administrators and Policymakers. Proceedings of the AsiaCALL International Conference, 4, 75–85. https://doi.org/10.54855/paic.2346

Nolan, B. (2024, September 30). AI plagiarism is spreading in US colleges. It’s left professors feeling confused and exhausted. Business Insider. https://www.businessinsider.com/ai-cheating-colleges-plagiarism-chatgpt-professor-2024-9

Otterbacher, J. (2023). Why technical solutions for detecting AI-generated content in research and education are insufficient. Patterns, 4(7).

Perkins, M. (2023). Academic Integrity considerations of AI Large Language Models in the post-pandemic era: ChatGPT and beyond. Journal of University Teaching and Learning Practice, 20(2). https://doi.org/10.53761/1.20.02.07

Perkins, M., Roe, J., Vu, B. H., Postma, D., Hickerson, D., McGaughran, J., & Khuat, Q. H. (2024). GenAI detection tools, adversarial techniques and implications for inclusivity in higher education. arXiv. https://arxiv.org/abs/2403.19148

Sadasivan, V. S., Kumar, A., Balasubramanian, S., Wang, W., & Feizi, S. (2024). Can AI-Generated Text be Reliably Detected? (No. arXiv:2303.11156). arXiv. http://arxiv.org/abs/2303.11156

Walters, W. H. (2023). The Effectiveness of Software Designed to Detect AI-Generated Writing: A Comparison of 16 AI Text Detectors. Open Information Science, 7(1), 20220158. https://doi.org/10.1515/opis-2022-0158

Ward, D. (2024, February 19). Careful use of AI detectors. Center for Teaching Excellence, University of Kansas. https://cte.ku.edu/careful-use-ai-detectors

Weber-Wulff, D., Anohina-Naumeca, A., Bjelobaba, S., Foltýnek, T., Guerrero-Dib, J., Popoola, O., Šigut, P., & Waddington, L. (2023). Testing of detection tools for AI-generated text. International Journal for Educational Integrity, 19(1), 26. https://doi.org/10.1007/s40979-023-00146-z

Wu, J., Yang, S., Zhan, R., Yuan, Y., Wong, D. F., & Chao, L. S. (2024). A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions (No. arXiv:2310.14724). arXiv. http://arxiv.org/abs/2310.14724

License


A Guide to Teaching and Learning with Artificial Intelligence Copyright © by heidiestrem; jasonblomquist; and lizalong is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.