What is the future of research?

This is on many of our minds after the release of Generative AI tools and features called “Deep Research” from Google, OpenAI, Perplexity, and other companies. 

Although academics may define research as an investigation or experimentation aimed at the discovery and interpretation of facts, most people probably view it more broadly as the collecting of information about a particular subject (Merriam-Webster).

And in today’s world, looking something up online is what often passes for research. We research insurance policies, restaurant menus, medical issues, history facts, travel destinations, stocks, you name it. Google’s search engine is so dominant, we even call this behavior ‘googling’. 

But the sheer amount of information available on the internet has become overwhelming. Google usually serves up a list of links that we then have to click/tap to find out more. Influencers and social media give us images and video content we have to sift through.

The problem of information overload is getting worse and worse.

Enter Generative AI and Deep Research

Generative AI has jumped in to solve this problem by generating statistically driven responses that look like answers, even when inaccurate information is created along the way. People are treating ChatGPT and the like as alternatives to search engines, often without realizing how they differ. 

“Deep Research” tools are the next frontier in Gen. AI, promising to save you the hassle of combing through the internet for information. You set them a task and they will do the research for you, digesting the key points into a concise report, complete with a works cited list.

It’s not just everyday looking-up-stuff research, either. The company behind ChatGPT, OpenAI, has mentioned the idea of AI having PhD-level capabilities, and their models have performed similarly to human PhD students in science, coding, and mathematics tests (Benj Edwards, ArsTechnica). Whether or not this level is possible, the idea is out there that AI may open up access to ‘smarts’ usually reserved for the elite.

How good are these Deep Research tools?

Various people, including podcasters, educators, and businesspeople, have run tests with these Deep Research tools and posted about their findings. Despite some reservations, many are impressed and see the ways the tools will help those who rely on internet searches for their day-to-day life and work. 

Jordan Wilson on the EverydayAI podcast tested several of the Deep Research tools by prompting for research about himself, as did Isar Meitis from the Leveraging AI podcast. They found some inaccuracies/hallucinations but were overall impressed with what the tools uncovered on the internet about their backgrounds and work. 

Allie K. Miller showed how You.com’s Advanced Research & Insights (ARI) tool can do market research and produce a McKinsey-like report, using the sample prompt: 

Please create a report for me that summarizes all recent trends in SEO, paying particular attention to how AI is changing SEO. This might include how to use AI to improve SEO and how I now need to sell and influence AI systems to find my website. 

The result was a downloadable PDF file including a cover page with image, executive summary, table of contents, and in-text citations.

Nicole Leffer wrote about some less expected examples of ways to use ChatGPT’s Deep Research, including reading scientific sources on a medical issue to create a plan for relief and conducting family history research. She also posted about 8 tips for getting the most out of the tool, such as writing strong prompts, specifying source types, and testing the limits of the tool. 

Ola Handford found ChatGPT’s Deep Research to be a great tool for certain tasks like exploratory or market research, offering the following example prompt:

Examine the development, ethical implications, and psychological impact of AI companions and griefbots. Investigate their technological evolution, design, and implementation, particularly in the context of bereavement support and emotional companionship. Analyze how these AI-driven entities affect human coping mechanisms, attachment patterns, and mental well-being. Explore societal and cultural perceptions of griefbots, their potential benefits and risks, and the ethical considerations surrounding digital immortality, consent, and data privacy. Compare current models and case studies to assess their effectiveness and limitations in providing emotional support. Consider long-term implications for human relationships, identity, and the concept of digital afterlife in an AI-mediated world. Use both academic and popular sources.

Michael Hanegan detailed his experimentation with using ChatGPT’s Deep Research to write two reports on Gen. AI and its impact on the future of work, using the following prompt:

Compile a research report on the impact of Generative AI on the future of work. I want the report to focus on the macro level and not be industry specific except in minor, illustrative explanations.

He found that the reports had immediate value and saved many hours of time, leading him to conclude this represents an exponential speeding-up of learning and knowledge production. His Substack post also includes both the reports and an audio version for anyone to review for themselves.

Mushtaq Bilal tested ChatGPT’s Deep Research on the topic of children’s literature with the prompt:

How can we read children’s literature as a world literature? I would like to theorize children’s literature as world literature.

He showed that the tool can create a well-researched article with real references that, with some editing, could be publishable, and that the tool displays its thinking process, which can help researchers with their own thought process. 

Olivia Inwood tested ChatGPT’s Deep Research to see how it would do trying to write a literature review for a PhD thesis (linguistics), starting with a basic prompt and then refining: 

Write a literature review on systemic functional linguistics research about social media data.

She found a surface-level understanding of theory, factual errors, and low source quality, but also good ideas and adequate mapping of broader themes. She noted that part of what makes a lit review, at least in the humanities, is having to grapple with a lot of information and understand the already-existing research community, which an AI tool doesn’t do.

Ethan Mollick discussed the need to review AI Deep Research outputs with a critical eye because you’re likely to stop being so critical after you get used to the tools. He advised starting with areas you’re an expert in (others start with their bios or research focus areas) so you can judge it and the source quality more easily. In one of his tests, he used the following prompt in his area of expertise:

When should startups stop exploring and begin to scale? I want you to examine the academic research on this topic, focusing on high quality papers and RCTs, including dealing with problematic definitions and conflicts between common wisdom and the research. Present the results for a graduate-level discussion of this issue.

The results satisfied him that the technology is very good and engages with academic literature better than previous iterations did. 

Leon Furze described the outputs of Deep Research as a long summary of search results rather than a deep, high-quality investigation in the traditional sense of the term research. He tested three Deep Research tools with the prompt:

Research changing international laws about deepfakes since 2020

The results led him to conclude that Deep Research does represent a step up in terms of source quality and results accuracy, but it lacks analysis of the information it puts together. 

In my own experimentation with Google’s Deep Research, I saw the attraction in having the internet search process conducted automatically, but also a compounding of the issues in leaning so heavily on what’s accessible online. For topics with a lot of internet coverage, Deep Research tools can work really well. But they face significant limitations, including an inability to access paywalled academic research and an inability to determine the quality of blogs, articles, and website content. They may place a blog post or Reddit thread alongside authoritative news sources because they all happen to be related to the topic. This risks flattening the information field and making all sources seem to be of equal weight. (Although part of that is academia’s own fault in allowing work to be paywalled by publishers and largely eschewing contributing to sites like Wikipedia or blogging about their research…)

What about academic and scientific research?

In the face of a proliferation of AI Deep Research tools, one risk is that sustained, in-depth academic and scientific research may lose its luster and perceived value. The very term ‘Deep Research’ threatens to make all research seem the same and easily accessible to everyone. If anyone can spend a few minutes doing ‘deep research’ with a prompt and click of a button, if everyday people think they are doing PhD-level research with these tools, there may be less appetite for funding high-quality academic and scientific research that takes time and energy.

Another risk is that academic research becomes increasingly less impactful as it gathers in paywalled places that cannot be accessed by Deep Research tools. People may rely on Deep Research’s inclusion of sources as a proxy for quality, even if the sources are random websites. There is a chance that Wikipedia will have its revenge and emerge as the most reliable information source online after decades of being scorned by academics—after all, it can be easily accessed by AI tools.

Meanwhile, for more complex research, AI tools such as Google’s ‘AI co-scientist’ are proving themselves useful to researchers. The BBC article “AI cracks superbug problem in two days that took scientists years” reports on a team at Imperial College London who used Google’s ‘AI co-scientist’ and prompted it to explore an antibiotics resistance problem they had worked on for years and hadn’t yet published anything about. Within 48 hours it had reached the same conclusion as the team had and also drafted other logical hypotheses. Google says that its tool is designed to go beyond literature reviews and even deep research tools to “uncover new, original knowledge and to formulate demonstrably novel research hypotheses and proposals”. These kinds of tools signal a future of AI-augmented research that blends traditional scientific inquiry with AI assistance.

Whether it’s “Deep Research” AI tools or tools tailored for more specific research purposes, all of them demand a certain level of AI literacy for the user to understand the strengths and limitations of the information being presented as well as the process of gathering and/or generating it. Having a solid understanding of traditional research methods will remain important at least in the short term, but the temptation of The Button that Mollick described back in 2023 may prove irresistible.

The internet changed the nature of everyday research for many around the world. Now AI tools like “Deep Research” are taking us down another path to deal with the problem of information overload the internet unleashed. 
