Tests in a GenAI World

I teach a graduate-level data management class at the University of Maryland, Baltimore County (UMBC). Let me preface this by saying that the midterm and final exams for my course are all online, open book, open notes. Despite this, and despite all my warnings about the dire consequences of plagiarism, I routinely catch students cheating on exams. I really love teaching, and it is very disheartening when I have to dole out the consequences for this behavior. Usually, cheating takes the form of students copying answers from other students or from the internet without attribution. This semester was different. This semester, students had AI.

It is clear that students will use Generative AI to generate answers for academic exams. However, I wanted to share my thoughts on this beyond “this is bad.”

They Are Cheating Themselves

The biggest issue that I have is that students who do this are ultimately cheating themselves. My exam questions are meant to give students an opportunity to think about the subject matter and how they would solve a particular problem. If a student simply asks ChatGPT for the answer, they miss out on this opportunity. I wrote a post on LinkedIn asking about this, and a good number of people responded that I should embrace the new technology. I’m all about embracing new technology,

How to Defeat GenAI on Tests

Let’s start with the easiest way: paper tests. I’ve decided that future tests in my class will be in-class, pen-and-paper exams. Second, OpenAI actually provides an API to assess whether text was written by GenAI. This is not a foolproof method, but if you see consistently high scores, you can get a pretty good idea of whether the answers were generated by AI.

The other thing I did was run all my short-answer test questions through ChatGPT to see what kind of answers it produced. After a while, you start to pick up on its style of answering questions. Once you see that, it becomes immediately obvious which students are using GPT and which are not. The GPT models are not deterministic, so you can’t do an exact comparison, but you will definitely see strong similarities.

For example, one of my test questions was: What is meant by database normalization? When (if ever) wouldn’t you want to normalize a database? Here’s ChatGPT’s answer:

Database normalization is the process of organizing data in a relational database to eliminate redundancy and dependency problems. It involves breaking down a database into multiple related tables and defining relationships between them. The primary goal of normalization is to reduce data duplication and ensure data integrity, consistency, and efficiency.

Normalization is achieved by following a set of rules, known as normal forms, which define various levels of organization. The most commonly used normal forms are:

First Normal Form (1NF): Ensures that each column contains only atomic values, and there are no repeating groups.

Second Normal Form (2NF): Builds upon 1NF and requires that each non-key attribute is fully functionally dependent on the entire primary key.

Third Normal Form (3NF): Builds upon 2NF and requires that non-key attributes are not transitively dependent on the primary key. In other words, there should be no indirect relationships between non-key attributes.

There are higher normal forms beyond 3NF, such as Boyce-Codd Normal Form (BCNF) and Fourth Normal Form (4NF), which address further dependencies and anomalies. However, normalization beyond 3NF is not always necessary for every database.

Normalization is highly beneficial for most databases as it offers several advantages, including:

Reducing data redundancy: By eliminating duplicate data, normalization saves storage space and ensures consistency.

Enhancing data integrity: Normalization prevents update anomalies, such as inconsistencies and contradictions that may occur when data is duplicated across tables.

Simplifying data maintenance: By breaking down data into smaller, manageable tables, modifications and updates become easier and less error-prone.

Improving query performance: Well-normalized databases often exhibit better query performance due to optimized table structures and minimized data duplication.

However, there are cases where normalization may not be the best approach:

Performance-critical systems: In some high-performance systems, denormalization techniques may be employed to improve query performance by trading off redundancy for speed. This is commonly done in data warehousing or reporting applications where read performance is crucial.

Small-scale or simple databases: For small-scale databases with limited data and simple relationships, the benefits of normalization may not outweigh the added complexity. In such cases, denormalized or partially normalized structures can be more straightforward and efficient.

Frequently changing requirements: If the database requirements are highly volatile and subject to frequent changes, excessive normalization can hinder flexibility and make schema modifications more challenging.

It’s important to note that the decision to normalize or denormalize a database depends on various factors, including the specific requirements of the application, performance considerations, and the trade-offs between data integrity, simplicity, and efficiency.
Chat GPT

The instructions for my tests say that these are intended to be short-answer questions, so if a student submits something like this, I know immediately that they didn’t write it. If the student were smarter, they’d write a prompt like this:

In 3-5 sentences, answer the following: What is meant by database normalization? When (if ever) wouldn’t you want to normalize a database?

This produces the following answer:

Database normalization refers to the process of structuring and organizing a relational database to minimize redundancy and dependency issues. It involves breaking down data into multiple tables and defining relationships between them. A database may not need normalization in scenarios where performance is a top priority, the database is small-scale with simple relationships, or the requirements are highly volatile, which may hinder flexibility and make schema modifications challenging. However, normalization is generally recommended for most databases to ensure data integrity, consistency, and efficiency.
Chat GPT
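Since an exact comparison against ChatGPT's output won't work, a fuzzy one can at least help sort a stack of submissions. What follows is only a rough sketch of that idea, not the process I actually used for grading: it compares overlapping three-word phrases between a student's answer and one or more answers I generated myself, and it also flags answers that are far too long for a short-answer question. The function names, the 0.3 similarity threshold, and the 120-word limit are all assumptions made up for illustration. A flag here means "read this one closely," not "this student cheated."

```python
import re


def shingles(text, n=3):
    """Return the set of overlapping n-word phrases in the text (lowercased)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b, n=3):
    """Jaccard similarity of two texts' n-word phrase sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, n), shingles(b, n)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def best_similarity(student_answer, reference_answers, n=3):
    """Highest similarity between the student's answer and any reference answer."""
    return max(
        (jaccard(student_answer, ref, n) for ref in reference_answers),
        default=0.0,
    )


def needs_review(student_answer, reference_answers, threshold=0.3, max_words=120):
    """Flag an answer for manual review (hypothetical thresholds).

    An answer is flagged if it is much longer than a short answer should be,
    or if it shares many three-word phrases with a generated reference answer.
    """
    too_long = len(student_answer.split()) > max_words
    too_similar = best_similarity(student_answer, reference_answers) >= threshold
    return too_long or too_similar
```

In practice you would generate several reference answers per question, since the model's wording changes from run to run, and pass them all in as `reference_answers`. Phrase overlap will miss a student who paraphrases the generated text, so this complements reading the answers rather than replacing it.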

GenAI Isn’t Always Correct

The other big thing that students don’t seem to be aware of is that just because GenAI produced an answer doesn’t mean the answer is correct. We saw this recently when an attorney submitted a GenAI-generated legal brief that contained all manner of bogus citations. Students should not assume that the answers generated by AI are correct.

Citations Are Still Required

Another major issue that I have with using Generative AI in an academic context is the lack of citations. When you submit academic work, you are certifying that the work is your own. However, if the work was generated by a computer, it is no longer your own work, and thus it requires a citation. My policy going forward is that failure to cite GenAI as a source will result in an immediate zero for the assignment.

These are my thoughts on the matter. What do you think?
