The value in research gaps

There’s value in the holes. If you search for information about a topic and don’t find much, that scarcity can be a clear signal that more research is needed to find the answers.

A research gap is exactly what Dr. Tina Lasisi, interviewed for the Melaninology episode of the Ologies podcast, credits for drawing her into her field of research [starting around 6:22]:

I have always been aware that different people have different skin color, but I never thought about how it was patterned around the world. What about other traits? How do those vary, and how do those evolve? And my immediate question as a Black woman, was okay what about my hair? Like okay, I understand why my skin is brown, but why is my hair curly? And the wild thing is there wasn’t a good answer! What should’ve been a really quick Wikipedia search that satisfied my curiosity, became this rabbit hole where I basically had this Postdoctoral Fellow who was at our college who took me under his wing and was like hey, let’s talk about BioAnth [Biological Anthropology], and I was like, oh, so I have all these questions and I can’t find anything about like, hair, and he basically was like well, sounds like that could be something for like your undergraduate thesis! And like as an undergraduate I decided ok let me get hair samples and measure them, and like yeah, long story short, basically this thing that should’ve been a short Wikipedia search ended up being a decade plus journey into understanding this trait and like why humans have it. (transcription mine)

I’d wager many other academics have a similar origin story for their own research projects.

Unfortunately, if you rely on a large language model to do research for you, or to help generate ideas about what to research, the nature of the tool means that instead of acknowledging that there are no results for your prompt, you can get output full of “best guess” citations built from words that are semantically similar to the words in your prompt.

Ben Davis writes about this phenomenon for Artnet in We Asked ChatGPT About Art Theory. It Led Us Down a Rabbit Hole So Perplexing We Had to Ask Hal Foster for a Reality Check. He and his colleagues found that if you attempt to perform research with ChatGPT, you are instead likely to receive a list of nonexistent citations:

Sometimes—if, for instance, I ask it to “give me a list of citations about the influence of Artificial Intelligence on European Medieval Art”—it accurately tells me that this query makes no sense—but then provides a list of made-up references anyway

Ben makes several attempts to get the system to provide references about a topic that likely has very little, if anything, written about it, going so far as to specifically ask for citations that actually exist:

When I ask a follow-up, specifying that the references now be actually “real,” my chatbot helper is again very helpful, but again just makes stuff up

Ultimately, he concludes that this flaw is not a surprise, and is in fact completely expected given what ChatGPT is built on:

The glitch seems to be a linear consequence of the fact that so-called Large-Language Models are about predicting what sounds right, based on its huge data sets. As a commenter put it in an already-months-old post about the fake citations problem: “It’s a language model, and not a knowledge model.”

In other words, this is an application for sounding like an expert, not for being an expert—which is just so, so emblematic of our whole moment, right?

This flaw makes large language models a decent fit for things like thought leadership, a genre built on sounding like an expert. But for circumstances where it truly matters to be an expert—like academia, or writing documentation—it’s a serious problem.

It’s one thing to confabulate thought leadership—it’s quite another to lie about product functionality in contractually offered software documentation.

There’s value in identifying the holes in knowledge and understanding. If you gloss over knowledge gaps with generated drivel, you miss out on the opportunity to dig deeper and learn something—and so too does anyone who might have discovered what you learned in your research.