Session 2D: (Paper 2) Prompting generative AI to solve Maths problems

Dr Russell Gerrard (Associate Professor in Statistics), Bayes Business School, City, University of London


First-year students on a Mathematics module were instructed to prompt generative AI to solve the maths questions and assess the output as though they were the lecturers and the AI the student.

There are concerns that some students, by adroit use of generative AI, might gain an unfair advantage over their peers when completing take-home coursework. At the same time, appropriate use of Large Language Models is a skill much in demand in the modern workplace. To address both issues, students on a first-year Mathematics module (AS1056) were set a summative group assessment worth 7.5% of the module mark. They formed groups of 3 or 4 and were set a collection of 8 mathematics questions (6 questions for groups of 3), with slight differences between the questions set to each group. Their tasks were (a) to craft careful prompts which would encourage two different genAI tools to solve each problem; (b) to assess the quality of the answers provided by the genAI tools, putting themselves in the role of the lecturer and the genAI in the role of the student; and (c) to review what they had learned about the capabilities of the genAI tools on which they had been invited to focus their attention. Students were provided with a list of ten generative AI tools which claimed to be able to solve mathematical problems, or at least to guide users towards solutions.

The results showed that groups which had solved the questions independently enjoyed notable success in prompting the AIs to produce good answers, whereas groups which seemed not to have solved the problems themselves were unable to distinguish high-quality answers from answers containing errors; in many cases they merely indicated which of the two answers they preferred, without recognising when both answers were wrong.

Issues of this type are beginning to be addressed in the literature, so far primarily in conference presentations. The most directly relevant work is Barana et al. (2023), presented at CELDA 2023, although prompt-crafting is underemphasised in their project, and the mathematics questions which their students submitted to ChatGPT were more elementary than those featured in the AS1056 task.

Participants will be provided with the instructions issued to students and with a sample set of questions. During the session I will share some of the better examples of prompt-crafting developed by the students, to demonstrate what can be achieved in guiding LLMs towards solutions to maths problems, and I will also share some of the students' evaluative comments on how well the generative AI tools performed the tasks allocated to them.

I hope that the session will be of interest to academic colleagues, particularly in quantitative subjects, who are finding that the current discussion of AI tools focuses largely on their impact on essays and presentations, whereas the impact of genAI on quantitative subjects is entirely different in nature.


References

Barana, A., Marchisio, M. and Roman, F. (2023) "Fostering Problem Solving and Critical Thinking in Mathematics through Generative Artificial Intelligence", Proceedings of the 20th International Conference on Cognition and Exploratory Learning in the Digital Age (CELDA 2023).
