Your PhD Expert? Don't hold your breath.
Like some 700 million other people, I've been playing around with ChatGPT 5 and... it's okay. By this, I mean that it is one of the leading AIs out there with sophisticated applications (though I think I still prefer Gemini and NotebookLM), but on the whole - so far - it feels like a modest upgrade rather than the real step change we saw in the move from 3.5 to 4. One thing it does very well, however, is streamline away the whole nonsense of choosing between 4o, mini, high, o3 and magic-kingdom (I might have made that last one up). Just as Steve Jobs, on his return to Apple, scrapped a whole confusing range of Mac models, presenting users with less choice really does make more sense.
What I'm especially interested in here, however, is the claim by OpenAI's CEO, Sam Altman, that ChatGPT 5 is like having a “team of Ph.D. level experts in your pocket.” As version 5 is available on the free tier (with some apparent restrictions, which I anticipate relate to the number of calls you can make to the model), I thought I'd test it on a particular problem I have worked on with PhD students to see how it handled the task. The short version: it looks great - until you pay closer attention. In terms of literary studies (my own discipline), it is very much a case of caveat emptor.
Before giving a rundown of why I think you need to take Altman's claims with a pinch of salt, a few caveats: this was a day-one quick test, utterly unrigorous in its approach. What's more, I really want to experiment with this at length because, as is often the case with AI, I think it will respond much better to structured guidance than to my deliberately naive approach. I also want to make it very clear that my observations relate to one discipline only, and that in other areas I would expect it to perform much better out of the box. Although I started in the humanities (and still do a lot of research around William Blake), I moved into different areas of tech and communications after my degree. Indeed, one of my first books involved coding for websites, and in this regard I love many of the LLMs out there, which help me plough through a ton of tedious stuff so I can concentrate on what I like.
My "test" - thoroughly unscientific, deliberately naive - was to ask ChatGPT a question that wouldn't mean much to 99.9999% of the human population, but which has vexed Blake scholars for a number of years. I started with the free version and asked it:
If I were going to conduct actual PhD-level research in this field, I'd spend much more time crafting a research question and prompt, but I wanted to see how ChatGPT would respond to what is effectively a request for a literature review. The topic involves... I can feel your eyes glazing over already, but ChatGPT responded that this was a "Great question" and provided an answer that, while nowhere near PhD level, addressed some of the primary considerations and could be worked up into something useful. To be honest, it pushes most of the work back on to you ("Read secondary literature for context", without telling me which papers and chapters might be helpful), which is what I've come to expect from ChatGPT: don't rely on it for research. I've included the response at the end of this post.
Turning to ChatGPT Pro/Team (I use Team as part of my work), the response was different by an order of magnitude. I've used the deep research tools before but, to be honest, had migrated to NotebookLM, so this was an excuse to try them again, and at first I was extremely impressed. Almost instantly it called me out on my half-baked question, asking me to clarify what I was looking for (you can see the exchange at the end of this post in the section headed "Prompts", with ChatGPT's responses in courier). It then took 8 minutes to find sources and just over 15 minutes to analyse them and produce a report (again, included at the end of this post), which absolutely blew me away when I opened it: here was a professional, detailed and extensive report which, at first glance, did indeed seem capable of satisfying the requirements of PhD-level work.
And then I read it.
It reads extremely well, being fluent and well-structured. It is also comprehensive in its account - very comprehensive. And it lists its sources, which is immensely valuable... until I started clicking on the links and discovered that, of the 73 listed references (an impressive number!), 38 - just over half - returned me to the same document. I've had to sit on a number of academic offence panels over the years, and if this were presented to me as a suspected case of plagiarism it would (probably) not be found guilty, because it lists all its sources and helpfully highlights which parts it has adapted, but it is heavily, heavily derivative. I hasten to add that the original paper is very good and I've used it myself, but it comes in at just over 7,200 words. The ChatGPT article included below weighs in at nearly 10,500 words: it certainly hasn't saved me any time.
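(As an aside for anyone auditing an AI-generated bibliography: rather than clicking every link by hand as I did, a few lines of Python can tally how many reference URLs resolve to the same document. This is only a minimal, hypothetical sketch - the URLs below are invented placeholders, not the report's actual references.)

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical example: the URLs listed in a report's references.
references = [
    "https://example.org/blake-article#section-1",
    "https://example.org/blake-article#section-2",
    "https://example.org/four-zoas-edition-review",
    "https://en.wikipedia.org/wiki/The_Four_Zoas",
]

def normalise(url: str) -> str:
    # Ignore fragments and query strings so links to different parts
    # of the same page are counted as one document.
    parts = urlsplit(url)
    return f"{parts.scheme}://{parts.netloc}{parts.path}"

counts = Counter(normalise(url) for url in references)
for document, n in counts.most_common():
    if n > 1:
        print(f"{n} references point at {document}")
```

The only design choice here is to treat links that differ by fragment or query string as the same page, which is the crude kind of duplication a check like this will surface; it won't catch two genuinely different URLs that redirect to one source.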
Some of the extra length comes from its additional sources... we'll leave aside the three references to Wikipedia (to be honest, it doesn't say anything wrong, so I'm not going on a Wikipedia-bashing exercise here), but the best reference, to an article on editing The Four Zoas, is cited only once. The article that would be most helpful in answering the question is simply noted rather than analysed because, apart from the first page, you need an institutional log-in to access the material. Equally disappointing, when dealing with some of the critical editions I'd asked it to consider, the response relies on reviews of those editions rather than the works themselves. If this were a PhD student of mine, at this point I'd probably explode at them.
And I know the difference because I have a thesis from a PhD student I supervised which answers the original question in about two-thirds the length of the ChatGPT article yet also introduces a significantly greater number of scholars, whom the AI misses for one very simple reason: the information is contained in books that are not openly available on the web, rather like the article on editing The Four Zoas, of which it only used the first, publicly accessible page.
This is, to repeat, an utterly unscientific test, and despite my critical observations I'm actually rather impressed by ChatGPT's response in many ways and will certainly play around with it more, but I can't see it replacing NotebookLM any time soon as my favourite research assistant. The good thing about Google's app is that you have to do a lot of the preparatory work, locating documents and links which are then loaded into NotebookLM so that it can go to work. I also appreciate that Google has made NotebookLM's outputs less polished, providing you with information that you have to sift and work on. This, in my opinion, is how AI has to work, augmenting our processes so that we can do more.
The problem for me with ChatGPT's approach is that it does an absolutely fantastic job of faking a PhD literature review - without actually doing one. In this quick and dirty test I bear some responsibility for not crafting my prompt more carefully, but where it struggles is with what Donald Rumsfeld infamously called "unknown unknowns": the information in books which it does not seem to be able to access (my scepticism stems from the fact that ChatGPT's LLM is not open source, so we cannot know what it does or does not have access to). ChatGPT fails at a significant part of doing a PhD - explaining what knowledge exists in a field so that a researcher can build on it - but it sounds like it is doing a PhD, which brings me to my final point.
In their paper "Navigating the Jagged Technological Frontier", a group of academics worked with Boston Consulting Group consultants, getting half of them to use ChatGPT 4 while the other half worked without AI. The former group saw their productivity rise immensely, writing more reports than the other group and showing a general improvement in quality. However, when ChatGPT was fed false data, the consultants who relied on it ended up producing misleading work because they weren't reading what it gave them. The reports they issued sounded far more professional than those of the consultants who worked without ChatGPT, but they contained data that was wrong. Their mistake was to assume that anything produced by generative AI was more than a first draft that needs to be checked and rewritten. For me, the conclusion is clear: when it comes to knowledge work you're not just the human in the loop, you're the human without whom the loop doesn't exist, and you should always take responsibility for the tools you use.
Prompts
What are the consequences of Blake's ordering of Night VII a and b in The Four Zoas? How do these affect our reading, and what is the primary research in this field?
To tailor the research effectively, could you clarify what you're most interested in regarding Blake's ordering of Night VII a and b in *The Four Zoas*? For example:
1. Are you focused on how this ordering affects narrative structure or thematic interpretation?
2. Do you want a summary of scholarly positions or debates on the ordering?
3. Are you interested in manuscript evidence, editorial decisions, or historical publication context?
4. Do you have a preferred set of scholars or editions (e.g., Erdman, Damon, Ostriker)?
Let me know so I can focus the research accordingly.
I am most interested in narrative structure, and debates about the ordering. I'm interested in all aspects of evaluation - historical, manuscript evidence and editorial decisions. My preferred set of scholars would be Erdman, Ostriker, Bentley and Keynes.
Great, I’ll research how the ordering of Night VII a and b in The Four Zoas impacts the poem’s narrative structure, and review scholarly debates surrounding this issue. I’ll pay special attention to historical and manuscript evidence as well as editorial decisions, with a focus on commentary and analyses by Erdman, Ostriker, Bentley, and Keynes.
I’ll let you know once I have the findings ready.
Sources used (8 minutes to find 20 sources)
Impact and Debates on Ordering of Night VII a and b in The Four Zoas
Searched for Four Zoas Night VII ordering
Read bq.blakearchive.org
ChatGPT
The article explores diverse scholarly perspectives on labeling two nights as "Night the Seventh" in Blake's manuscript. It highlights ongoing debates and the evolution of editorial interpretations.
Read more from bq.blakearchive.org
ChatGPT
The user points out that John Kilgore mentions Alicia Ostriker's stance in "William Blake: The Complete Poems (1977)" about Blake potentially omitting a section and later replacing it.
Read more from bq.blakearchive.org
Read romantic-circles.org
ChatGPT
Interestingly enough, Bentley's dedication to diplomacy is evident in the delicate referencing from Blake's The Four Zoas. This snippet is key and seems to span lines 5410-5420.
Read more from romantic-circles.org
ChatGPT
I’m thinking through the broader context in the PDF to ensure a comprehensive understanding.
Read more from romantic-circles.org
Read bq.blakearchive.org
ChatGPT
I’m evaluating variations in editorial decisions, specifically on transcribing visual elements and comparing older and newer texts for Rivière's work.
Read more from bq.blakearchive.org
ChatGPT
I’m working through manuscript pages, comparing two sections (VIIa and VIIb) and examining Erdman's decision to treat them as separate units.
Read more from bq.blakearchive.org
ChatGPT
This summary explores the Santa Cruz Blake Study Group's review, Erdman's editorial choices, and references to Alicia Ostriker's perspective on Blake's textual arrangement.
Searched for Alicia Ostriker Four Zoas VII
ChatGPT
Scanning the "William Blake: The Complete Poems."
Read more from bq.blakearchive.org
ChatGPT
I'm piecing together "Night VII a" and "Night VII b" by examining how the first distinct vowel sounds are introduced, noting their varied initial consonants.
Read more from bq.blakearchive.org
ChatGPT
Ostriker's integration of variants with italicization, as per Keynes, reflects presented material's delineation.
Read more from bq.blakearchive.org
ChatGPT
I’m mapping out Alicia Ostriker's editorial choices and Bentley's manuscript facsimile for The Four Zoas, focusing on the narrative impact of varying textual sequences.
Read blakequarterly.org
ChatGPT
I'm beginning to think that the snippet might not be directly accessible due to potential login restrictions on "blakequarterly.org".
Searched for Four Zoas Nights VIIa VIIb order
ChatGPT
I’m digging into the sequence and treatment of 'The Four Zoas' in Blake's editions, particularly focusing on how editions by Keynes and Ostriker handle variant passages and the sequence of nights like 7a and 7b.
Read en.wikipedia.org
ChatGPT
Checking scholarly references to clarify editorial nuances, especially in the last two nights.
Read more from en.wikipedia.org
ChatGPT
I’m analyzing how Bentley Jr., Ostriker, and Erdman address the order of Blake's manuscript, with Bentley suggesting 7a precedes 7b and focusing on their narrative implications.
Searched for Bentley Four Zoas ordering
Searched for Geoffrey Keynes Four Zoas VII
ChatGPT
Clicking on "The Revision of the Seventh and Eighth Nights of The Four Zoas" by Andrew Lincoln for a scholarly review.
Read bq.blakearchive.org
ChatGPT
I’m gathering info on how to craft a more seamless narrative by blending different segments of Andrew Lincoln's work. This effort includes blending annotations and sections to ensure a coherent flow.
Read academic.oup.com
ChatGPT
OK, let me see. Keynes's labelling and narrative choices have shaped the understanding of "Night VII" sections, leading to various editorial interpretations.
Read maypoleofwisdom.com
ChatGPT
I am analyzing "Energy and Archetype: a Jungian analysis of the four Zoas," potentially by Keynes, and searching for references to "two versions of seventh night."
Read more from maypoleofwisdom.com
ChatGPT
Keynes labels the revised "Night the Seventh" version as 7a and the earlier draft as 7b, possibly printing 7a first and 7b after the main text or as an appendix.
Read more from maypoleofwisdom.com
ChatGPT
Examining manuscript evidence, historical context, and editorial decisions to understand the sequence of Nights VIIa and VIIB, including Bentley and Keyne's interpretations and narrative coherence.
I’m mapping out the citations to ensure they are clear, consistent, and properly formatted, avoiding any potential policy issues. Each reference is meticulously cross-checked and unified.