As artificial intelligence (AI) has matured and the general public has had more opportunities to experiment with it, a few specific use cases have risen from the discourse. Perhaps one day I will investigate these other use cases, but for this article, I want to focus on the idea that AI, in particular ChatGPT, is good for finding sources. According to Papers With Code, a research paper aggregator focused on AI performance, ChatGPT’s factual accuracy ranges from just shy of 90% on the best of days down to as low as 51%. When this became apparent to consumers, some started to argue that AI like ChatGPT shouldn’t be used for providing facts, but rather for providing sources for facts, a role similar to the one many give Wikipedia and other online encyclopedias. And so, I decided to put this to the test.

My first attempt actually involved a different article of mine, and is what prompted me to give this a go. I was investigating mentions of usury/high interest, and wanted to make a historical connection to various religious texts. A quick Google search found my sources for the Abrahamic religions (e.g., Judaism, Christianity, and Islam), but I was having a bit more difficulty finding any reliable information about Hinduism. So, I decided to ask ChatGPT. My focus was on the Vedas, a set of particularly spiritual Hindu scriptures. I asked ChatGPT if there were any mentions of usury in the Vedas. Lo and behold, it gave me some answers!

I thought I was set. It didn’t write out the specific hymns, so I copied and pasted the sources it gave into some online versions of the text. However, when navigating to the relevant book/chapter and hymn, I found no mentions of usury. They are, of course, hymns, so I read some of the adjacent ones to see if perhaps they were treating the topic more symbolically, but nothing. I went back to ChatGPT, saying that it was wrong, to which it gave a different set of hymns. Again, navigating to the relevant sections, nothing. I copied and pasted the text of the hymns to ChatGPT, asking if this was discussing usury. It gave an apologetic answer, saying it wasn’t about usury, and led me to a different set of hymns to read. Again, nothing.

This was a disappointing start. I’ve been a fairly outspoken critic of artificial intelligence for a while, and being surrounded by peers who use AI, I’ve been presented with plenty of claims of its effectiveness in certain areas, source finding included. Nevertheless, I wanted to give AI a fair chance. Hindu scripture is not my personal expertise, so I turned to two areas I am a bit more knowledgeable in: physics and teaching.

For physics, I chose a topic that is somewhere in the middle of the field in terms of difficulty: neutrino flavor mixing. What you need to know is that the concept is simple enough that it can be discussed in a simplified format in introductory quantum physics courses, but complex enough that it is still being actively studied. I asked ChatGPT where I could read more about the topic. It gave four groups of sources: textbooks, papers, online sources, and online lectures. I decided to test the first resource in each group. The textbook ChatGPT gave back was indeed a real textbook (“Introduction to High Energy Physics” by Donald H. Perkins), and when I flipped to the relevant page, I found the same material from my introductory quantum physics lectures with a bit more meat, though with the same simplification my quantum course used.

The review paper it spit back was somewhat out of date, being from 2003, but provided good historical background on the state of neutrino research at the time (with a focus on solar neutrinos — I am unsure what the state of other neutrino research was at the time). The online source was a .gov website and was amazing. It contained a 2024 resource which included an introduction on the non-simplified version of neutrino flavor mixing, a review of neutrinos from a variety of sources (not just solar), as well as data tables on relevant parameters. The online lectures were in the middle in terms of depth, but I was surprised by the particular video lecture it recommended — MIT OpenCourseWare, which only had a few thousand views on its neutrino flavor mixing lectures. Despite some of my criticism, overall ChatGPT gave some good sources for this not-too-obscure not-too-introductory physics topic.

My next topic was pedagogy regarding classroom management. I chose this topic because, unlike physics, it might attract conflicting opinions and could reveal biases or blind spots in the data set ChatGPT was trained on. ChatGPT returned six groups of resources: books, journals, websites, professional organizations, YouTube channels, and social media. To keep from boring you too long, I’ll give a quick summary of each, using the same method as I did for the physics sources.

Unfortunately, I had to evaluate the book through a few reviews available through the Galvin Library, and it seems reviews are mixed: one praised the book, while another was critical of its more traditional approach, and overall, the book seemed hit or miss in its agreement with current pedagogical theory. The journal was good: it is actively publishing and on the cutting edge of pedagogical research, though it doesn’t have an explicit focus on classroom management. The website was disappointing. Its focus seemed to be on publishing articles by teachers, which sounds good until you realize these are less like academic articles and more like op-eds, and again, without a focus on classroom management. The professional organization recommendation from ChatGPT was just bad: it recommended a teachers’ union and policy advocacy organization, great for learning about those topics, but with very little information about pedagogy or classroom management. The YouTube channel did claim to have a pedagogical focus, but the few videos I watched about classroom management never related back to pedagogical theory and, like the website, leaned more on editorial and life experience than tested theory. The final resource was hashtags on X (which ChatGPT referred to as Twitter). Hopefully, given everything else I’ve said, I don’t have to explain why X is not a good resource for learning about classroom management.

So, is AI good for finding sources? It depends. From my very surface-level investigation, it appears that AI might be dependable for finding sources related to the physical sciences. Topics that are unlikely to change, have a high degree of academic documentation, and are well agreed upon are probably more likely to have good sources that AIs are aware of. Topics that are more controversial, discussed by a wider audience, highly specific, poorly documented, or that require some higher-order interpretation (as was the case with my usury research) are probably a no-go. And if you know how large language models (LLMs) like ChatGPT work, this becomes much more obvious.

Without going into too much detail, LLMs are very advanced auto-fill. They take a piece of text that you have written and try to guess what the next text would be. Don’t take my word for it; Google’s own introductory article on LLMs says just about that. For ChatGPT’s general question-and-answer format, it has been trained on datasets that are in a question-and-answer format. When it sees a question you have sent, it guesses what the answer would look like based on how its datasets have answered questions (very roughly). ChatGPT does not know what a right or wrong answer is, only what an answer looks like.

Therefore, its efficacy is directly tied to the datasets it’s trained on. If it’s been trained on a lot of physics journals and forums, it will be really good at answering physics questions. If it’s been trained on tainted or biased data, its answers will be tainted or biased. It cannot think for itself. For example, asking it to find examples of usury in the Vedas was a bad question. It probably doesn’t have access to the Vedas, but even if it does, it cannot think about the Vedas and find mentions of usury. The answers it gave me were based on other people asking questions about the Vedas. That’s why it was able to give me real books, chapters, and hymns; it just had no idea what was actually in those hymns.

So no, I do not believe AI is good for finding sources. In some specific circumstances, it might be okay, but you’d have to be knowledgeable about the quirks and construction of LLMs, or be willing to test the quality of its sources, to know if it’s worthwhile. And in the end, there are much better resources for finding sources. Libraries, experts (which we have ready access to as college students), journals, and so much more are at our fingertips. Why trust a fancy autofill when so much better is available?
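To make the “advanced auto-fill” idea concrete, here is a minimal sketch of next-token prediction using a toy bigram model. The corpus, function name, and example words are all invented for illustration; real LLMs use neural networks trained on vastly more data, but the core behavior is the same: predict the most likely continuation, with no notion of whether that continuation is true.

```python
from collections import Counter, defaultdict

# Tiny invented corpus standing in for a model's training data.
corpus = (
    "the vedas mention dharma . "
    "the vedas mention rituals . "
    "the vedas mention hymns ."
).split()

# Count which word follows each word: a bigram "language model".
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent continuation seen in training."""
    return following[word].most_common(1)[0][0]

# The model "answers" with whatever continuation was most common
# in its data; it cannot check that answer against reality.
print(predict_next("vedas"))  # prints "mention" (seen 3 times)
```

Ask this toy model what follows “vedas” and it confidently says “mention,” not because it has read the Vedas, but because that is the statistically likeliest next word in its training text. Scale that idea up and you get an LLM that produces real-looking citations without knowing what those citations contain.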
