Monday’s post was what I called Claude Code fan fiction: a supply-and-demand analysis of what happens to academic publishing when the cost of producing a manuscript collapses. It went viral, which I didn’t expect. This post is the companion piece, but it’s about something different. Something that’s been bothering me more.
Someone on LinkedIn asked me to demonstrate a “purely automated paper” because they were skeptical. So I filmed it. Here’s the video. It’s long even with me pausing, but you’ll get the gist if you watch the first half and skip around.
Thanks again for supporting the substack. If you aren’t a paying subscriber, please consider becoming one.
I went to bed and woke up with a paper
Monday night, 9:21 PM. I created a directory, pointed Claude Code to it, and gave it the vaguest prompt I could come up with. I told it I study labor and mental health, that I’m interested in drug policy, that I like Autor and Acemoglu and Ricardo and Smith. I said I wanted to see if it could make a paper without me putting hands on the wheel. I gave it access to Mixtape Sessions repos so it could see good code and real slides. And then I put the Stephen King limited series, 11.22.63, on and waited.
Claude Code chose a topic: marijuana legalization’s effect on employment and mortality. It crawled the web and found the data. It came up with a research design: diff-in-diff using Callaway and Sant’Anna with robustness from TWFE, Sun-Abraham, and Gardner. It was more or less done by 10:45, but I went to bed at 10:30.
It’s true that I didn’t entirely take my hands off the wheel. I was fairly often saying things. But it wasn’t prompting, not exactly. It wasn’t writing either. It was automation minus epsilon. I told it to write like my favorite economics writer, Martin Weitzman. He wrote with this difficult to describe kind of rhetoric: moral urgency, seriousness, excellent prose, as if what he was talking about right then was the most important thing that could be talked about, because of how much was at stake, while still managing to sound like a scientist. A very weird skill that he pulled off his entire career from day one until the end, and that was the way I wanted this paper to sound.
When I woke up, I ran it through my Referee 2 persona and submitted it to Refine.ink. That process, along with my checking, caught real bugs: a Sun-Abraham aggregation error that doubled the treatment effect, and 295 false zeros in the incarceration data from a coverage boundary. I paid about $100 total across two rounds of Refine.ink and had Claude fix everything. Done. Total active time: about 3.5 hours. You can watch the video to see me do most of it.
I haven’t carefully read the paper yet. But I’ve seen the event study plots. And the one on wages has been hard to ignore and made me wonder: will I read this paper? Will I work on this paper now? Should I?
Are we asking the wrong question?
Project APE at the University of Zurich is pitting fully automated papers against AER and AEJ articles in head-to-head matchups. The AI papers win 4.7% of the time, but over time that’s grown to over 7%. Though the Elo gap is huge, occasionally a paper pops up in the right tail, as you’d expect from the normal distribution, which has infinite tails. And so naturally, the conversation has become: can AI papers compete at the top 5? And that’s also naturally met with incredulity.
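To get a feel for how large a gap those win rates imply, here is a back-of-the-envelope sketch. It assumes the matchups follow the standard Elo logistic curve on a 400-point scale and that ties are ignored; both are assumptions made for illustration, not a description of how Project APE actually computes its ratings. The function names are hypothetical.

```python
import math


def elo_gap_from_win_rate(p: float, scale: float = 400.0) -> float:
    """Rating gap (favorite minus underdog) implied by the underdog's win rate p.

    Assumes the standard Elo expected-score formula:
        p = 1 / (1 + 10 ** (gap / scale))
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return scale * math.log10((1.0 - p) / p)


def win_rate_from_elo_gap(gap: float, scale: float = 400.0) -> float:
    """Underdog's expected win rate when it trails by `gap` rating points."""
    return 1.0 / (1.0 + 10.0 ** (gap / scale))


if __name__ == "__main__":
    # The two win rates quoted in the text: 4.7% overall, and just over 7% lately.
    for p in (0.047, 0.07):
        gap = elo_gap_from_win_rate(p)
        print(f"win rate {p:.1%} -> implied gap of about {gap:.0f} Elo points")
```

Under those assumptions, a 4.7% win rate corresponds to a gap of roughly 523 points and a 7% win rate to roughly 449, so the move from 4.7% to 7% would amount to closing about 70 points of a gap that is still very large.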
But I don’t know if I agree that that’s the right question. I don’t think that the policy relevant or the scientifically relevant question is whether 3D printed manuscripts will ever be published in the AER, since most economists never publish in the AER. It’s not the modal experience for research productive economists to publish in top 5s.
The majority of research economists don’t publish in the AER. The modal number of AER publications over a lifetime career is, I’d guess, zero. People have very productive careers publishing in the second tier top 10 journals like REStat and EJ, but even then the backbone of most economists’ vitas is publishing at the field journals: JHR, JOLE, JPubE, Economic Inquiry, Southern, Labour Economics, Health Economics, etc. The top 40, not the top 5. Tenure decisions at very good schools rarely require junior faculty to have a top 5.
So the real question isn’t whether an automated paper can beat an AER article in a blind matchup. It’s whether it can compete at the Journal of Human Resources. Can it get past the desk at Economic Inquiry? At JOLE? And when I look at this marijuana paper (standard diff-in-diff, publicly available data, clean pre-trends, a question that fits squarely into a well-established literature) the answer is: possibly. I don’t know yet. Marijuana policy isn’t my field, and I haven’t read what the paper says carefully enough to know how it compares to the existing literature. But the methods are boilerplate. The data is real. The event studies look reasonable. And I audited the code with sub-agents three times, replicating the code in multiple programming languages, just to find the bugs. There don’t appear to be any bugs. The data is real.
Now, was the data correctly handled? Were the laws coded correctly? Claude used cannabis policy data from the Harvard Dataverse for this project, but did it process it correctly? How well did it clean the data? Are these wages correct? On and on. Those I don’t know, but the thing is, these are also trivially answerable. They’re easier to answer now than they ever have been, and if they weren’t handled well, then they can be.
But the thing is, this kind of paper has a literature. It’s an active research agenda for a lot of people. RAND works on this topic. The repercussions of drug legalization are an important policy question. It matters for people’s lives and the organization of society itself. And the stakes of getting the answer wrong are high. So again I ask: could this paper make it past the desk, blinded, at JOLE? Maybe, maybe not. What about Journal of Population Economics? What about Journal of Health Economics? Consider these papers.
- Mathur & Ruhm (2023), “Marijuana legalization and opioid deaths,” Journal of Health Economics. This is the closest hit. They find that medical marijuana with retail dispensaries is associated with higher opioid mortality, not lower. Directly relevant to the overdose outcome in the 3D printed paper that Claude wrote and updated.
- Chakraborty, Doremus & Stith (2023), “The effects of recreational marijuana legalization on employment and earnings.” This started as NBER WP 30813 and was published in Journal of Population Economics (2025). They find little adverse effect on labor markets for most working-age adults, with some agricultural employment gains. Very similar question to Claude’s 3D printed paper.
- Anderson, Hansen & Rees (2022), “The public health effects of legalizing marijuana,” Journal of Economic Literature. A comprehensive survey covering crime, mortality, and labor market effects.
So that is the real question to me: could these 3D printed papers make it past the desk, to referees, get R&Rs, and get published at top field journals? We don’t know the answer to that question, and it’s not obvious yet which parts of that question are important and which are not.
What happens when marginal cost is zero?
But put journals aside. I want to say something else. What interests me more is something simpler. If the marginal cost of producing a submission-quality manuscript, complete with identification strategy, real data, and replicable code, has collapsed to zero, then we should expect this to be used for any task where the marginal benefit is just above zero. Look at this graphic and think about it for a second.
Historically, any kind of econometric application would only be undertaken for high-value problems. Why? Well, it took so long, could be screwed up in so many ways, and the refereeing process chewed up years of your life even ignoring the time it took to get to a manuscript. Even something as simple as diff-in-diff only really made sense to undertake if the question was important enough to justify months of data collection, cleaning, coding and writing.
But I did this in 3.5 hours while watching Netflix. You can drum up a diff-in-diff with actual data, done more or less correctly with robustness checks and event studies, while you sleep. Which means the marginal cost of producing a manuscript is now zero. Which means we should see these things used for trivial things: low value projects, way down the demand curve, are now probably candidates for econometric analysis, no matter how trivial they may be.
So if that’s the case, then we should see causal inference move into tasks where it can be useful but where the marginal value was always kind of small. Not career-making research. Not AER papers. Maybe things that keep you from getting fired from your consulting job. Things that settle an argument in a policy memo. Things that answer a question a city council member asked and nobody ever bothered to study because the effort-to-importance ratio was too high. Diff-in-diff is the most popular quasi-experimental method. It’s easy to do and it’s relevant in a lot of situations. If the marginal cost of doing it is zero, then aren’t we heading to the same place that computing simple averages and making line graphs went: something you just do, because the data is there and the cost of not doing it is the same as doing it? Couldn’t diff-in-diff and instrumental variables just become household names if the marginal cost of doing them is zero? I mean, why wouldn’t I run a synthetic control when deciding whether or not to have pepperoni pizza at my kids’ little league party? Why not present shift-share results at the PTA meeting? I mean, these are low value uses, and now the cost of doing them is zero, so … won’t they happen?
We built an entire graduate curriculum around methods that took months to implement properly. The scarcity of execution time was itself a screening device: only questions deemed important enough got studied. That filter is gone. What replaces it?
Is the finding true?
Here’s where it gets strange for me. I now have a result. The result is this: cannabis legalization appears to increase weekly wages by about 2.2 percent. Employment effects are null. Overdose mortality shows no discernible change. I have this graphic.
The data is not hallucinated. It’s from the BLS and CDC WONDER. The methodology is boilerplate diff-in-diff with robustness across four estimators, with the prominent one being Callaway and Sant’Anna. The event studies, particularly for wages, show clean pre-trends. It’s not dissimilar from papers I’ve seen published in this literature.
But is this result true? Should I believe it? Should I read the manuscript it wrote? Should I now work on this project when I never came up with it, and if I do work on it, what am I obligated to say and do? Do I coauthor it with Claude? Am I first author? Am I Claude’s research assistant, not the other way around?
It’s one thing to do these on a substack, kind of showcasing what can be done, but it’s quite another thing to drop everything and barrel head first into doing this, letting Claude Code shake you around so that you’re focusing now on the project it came up with. Now it really is the blind leading the blind. And the fact that this can be drummed up so easily, with no effort: what do I make of “findings” now?
Al Roth once said empirical work exists for three reasons: 1) to establish facts about the world, 2) to tell theorists their models may need to be tweaked or abandoned, and 3) to whisper in the ears of princes. So here I am with a result produced by a machine. Is that event study plot a fact? Or does it only become a fact if it’s published in a peer reviewed journal? Is a mean a fact then when it’s not published in a journal? If I pull down my bank statement, and produce a pie chart of spending, is that a fact even if it isn’t published? Just what is knowledge now if it’s produced with no effort at all and yet I don’t think to send it out? Do I even need to send it out? Should I take this event study plot and tell a theorist their model is wrong? Should I show this event study to a state legislator considering legalization? Or do you only show event study plots to state legislators when the study is published, in which case what then do we do about the unpublished NBER working papers that newspapers treat as published all the time?
At what point does the conversation shift from policing who wrote the paper to asking whether the finding is true? And what makes it true? Is it true because it’s true or is it true because someone verified it and told me it was true? Do I need to read this paper? Will I read this paper? Will I burn this paper? Will I vow never to work on this topic because Claude wrote this paper?
Because if it is true, if cannabis legalization really does increase wages through labor market formalization and criminal record removal, then does it matter that no human designed the study? The data is real. The methods are credible. The result either holds or it doesn’t.
But how do I know it’s true? Do I need the peer review to say it’s a fact? Did I need the peer review before it was published to believe things? It’s all very strange and frustrating, in part because, no matter how we slice it, there very well could be a massive avalanche of “purported facts” coming, whether to the journals or to our lives, in such volume that, lacking a verification system, how will we sift through it? We could have more submissions than slots. It could create major bottlenecks in an already very long publication process for all of us. Is the time to publication going to go up now? I write a manuscript and now I’m standing in a very long line behind 10,000 robots?
But I caught the bug because I’ve spent twenty years learning this
I did catch a Sun-Abraham aggregation error, though. You can see it in the video. Something smelled off because its estimate was so different from the others, and it should have been nearly identical to CS. I knew to drop the Vera incarceration data when I saw the coverage gaps. I was still involved.
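That kind of smell test, where one estimator disagrees sharply with the rest, can be roughed out in a few lines. This is only a sketch: the function name, the 50% tolerance, and the point estimates are all hypothetical and made up for illustration. They are not the paper’s numbers, and a real check would also look at standard errors and at the event study coefficients rather than a single summary estimate.

```python
from statistics import median


def flag_outlier_estimates(estimates: dict[str, float], rel_tol: float = 0.5) -> list[str]:
    """Return the estimators whose point estimate sits far from the group median.

    An estimate is flagged when it differs from the median by more than
    `rel_tol` times the absolute value of the median.
    """
    center = median(estimates.values())
    if center == 0:
        raise ValueError("median estimate is zero; a relative tolerance is undefined")
    threshold = rel_tol * abs(center)
    return sorted(
        name for name, est in estimates.items() if abs(est - center) > threshold
    )


if __name__ == "__main__":
    # Hypothetical overall ATT estimates from four diff-in-diff estimators.
    # These values are invented; one is deliberately about double the others.
    att = {
        "callaway_santanna": 0.022,
        "twfe": 0.020,
        "gardner": 0.023,
        "sun_abraham": 0.045,
    }
    print(flag_outlier_estimates(att))
```

With these invented inputs the only flagged estimator is `sun_abraham`. A flag like this does not say which estimate is wrong, only that something deserves a closer look, and knowing where to look next is the part that still took years of training.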
But the thing is, I only know to do that because I’ve been exerting myself for years on learning causal inference methodologies, particularly diff-in-diff. I’ve written a book about it and its sequel. I’ve published and taught until I was blue in the face. So I see things.
But there’s another diff-in-diff coming. Will I see it then?
And another after that. And another. The free-rider problem cuts both ways here. If the machine does the work, what incentive do I have to keep investing in the human capital that lets me catch the bugs? My skills depreciate. Or worse, they never get created in the first place for the next generation. At what point does the work below the marginal cost curve just happen, regardless of whether anyone is qualified to verify it?
And then there is the paper the machine made for me. Should I read it? Should I work on it: seek to improve it, explore it more, push it further? Do I touch this? Does it make me a bad person if I do? These are questions I don’t quite know what to do with because I’ve never asked them before, never had to think about them before.
The speed is the thing. Even if you can say “I don’t think these are making AERs, so I’m not going to worry about it,” that doesn’t address the fact that they may already be making papers suitable for JHR or JOLE or the Journal of Law and Economics. They might be. And that changes fast. And nobody has done a clean test. And the people offering their expert opinions have things to lose.
The potential is there. But so is the free riding. And so are the questions I don’t have answers to.
I think we all find out soon.




