OpenAI's Latest Study: Tackling Hallucinations in ChatGPT
OpenAI aims to reduce the frequency of so-called hallucinations in ChatGPT: outputs in which the model produces inaccurate or entirely fabricated information.
Hallucinations take various forms, including distorted facts and made-up individuals, events, or whole narratives. They are especially visible in ChatGPT simply because of its enormous user base, which surfaces these errors constantly. Although the developers openly warn that the model may produce inaccurate information, most users tend to overlook the warning.
ChatGPT hallucination examples
1. In April 2023, ChatGPT levelled a false sexual harassment accusation against the well-known law professor Jonathan Turley, citing a non-existent article from The Washington Post. The blunder occurred during a research query about sexual harassment at higher-education institutions, and attempts to correct the erroneous output proved fruitless: the AI began citing an ongoing court case instead.
2. In May 2023, lawyer Steven Schwartz used ChatGPT to prepare a court filing without verifying the output. The court rejected the submitted materials because most of the cited cases turned out to be fabricated or inaccurately quoted.
3. In June 2023, journalist Fred Riehl asked ChatGPT to summarize case documentation; the model instead fabricated allegations of financial crimes against broadcaster Mark Walters. Walters decided to capitalize on the sensational situation by filing a lawsuit against OpenAI.
It's worth acknowledging upfront that many legal professionals are sceptical about such lawsuits: it is inherently difficult to pinpoint the original source of the information and to prove the company's culpability, given the existing warnings. The more sensible path is for users to treat the new technology mindfully rather than trying to monetize its mistakes out of thin air, while OpenAI works on a solution to curtail this widespread issue.
OpenAI research
OpenAI is making significant strides in tackling hallucinations in ChatGPT and is exploring several techniques to make generated answers substantially more accurate. One approach involves training and comparing two reward models: process supervision, which rewards each correct reasoning step, and outcome supervision, which rewards only the final answer.
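To make the contrast concrete, here is a minimal toy sketch in Python. It is not OpenAI's implementation: the reward functions, grading, and data are hypothetical, and only the difference between the two feedback schemes is the point.

```python
from typing import List

def outcome_reward(final_answer_correct: bool) -> float:
    """Outcome supervision: a single reward that depends only on the final answer."""
    return 1.0 if final_answer_correct else 0.0

def process_reward(steps_correct: List[bool]) -> float:
    """Process supervision: credit for every intermediate step judged correct."""
    if not steps_correct:
        return 0.0
    return sum(steps_correct) / len(steps_correct)

# A solution whose reasoning is mostly sound but whose last step goes wrong.
steps = [True, True, True, False]
print(outcome_reward(final_answer_correct=False))  # 0.0: no signal about where the solution failed
print(process_reward(steps))                       # 0.75: credit for the three correct steps
```

The practical difference is the granularity of feedback: outcome supervision tells the model only whether it succeeded, while process supervision points to the exact step where the reasoning went off track.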
"We evaluate our process-supervised and outcome-supervised reward models using problems from the MATH test set. We generate many solutions for each problem and then pick the solution ranked the highest by each reward model," reads OpenAI's official statement.
The results are encouraging. Process supervision, in which the model learns from intermediate steps approved by human annotators, achieved an accuracy of 78.2%. While the method has proven effective and consistent, it has so far been applied mainly to mathematical problems. To expand this research, OpenAI has shared the underlying dataset and invited advanced users to test it.
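The evaluation procedure quoted above amounts to best-of-N reranking. The sketch below assumes a hypothetical score(problem, solution) interface standing in for a trained reward model: many candidate solutions are sampled and the one the reward model ranks highest is kept.

```python
from typing import Callable, List

def best_of_n(problem: str,
              candidates: List[str],
              score: Callable[[str, str], float]) -> str:
    """Return the candidate solution that the reward model scores highest."""
    return max(candidates, key=lambda solution: score(problem, solution))

# Stand-in scorer for demonstration; a real setup would query a trained reward model.
def dummy_score(problem: str, solution: str) -> float:
    return 1.0 if "84" in solution else 0.0  # placeholder heuristic, illustration only

candidates = ["12 * 7 = 74", "12 * 7 = 84", "12 + 7 = 19"]
print(best_of_n("What is 12 * 7?", candidates, dummy_score))  # "12 * 7 = 84"
```

The quality of the selected answer therefore depends entirely on how well the reward model ranks solutions, which is exactly where the process-supervised model showed its advantage.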