Wednesday, February 4, 2026

Skills and the Split-PDF Workflow

Introducing “skills” for Claude Code: what they are, why you’d want them, and a new one I built for reading academic papers without crashing your session

Thank you all so much for the continued support of this series. This is a continuation of my ongoing series on Claude Code for applied quantitative social scientists. I really do believe that 99% of what’s written about Claude Code is by software engineers for software engineers, or by computer scientists for computer scientists, and decidedly not by quantitative social scientists for quantitative social scientists. And my belief is that demand here is highly elastic, meaning that if we can just lower the bar to using this technology called Claude Code, then the quantity demanded will increase, and probably by a lot, given how powerful it is for our own productivity as workers.

We are not in the business of making products to be bought and sold on product markets. We are in the business of producing scientific knowledge that goes into the scientific record, consumed by students and our peers, hopefully published, and hopefully accurate and ultimately truthful. As such, I think the market will consistently underprovide what you and I need to figure out how to harness Claude Code and other AI agents for the type of work we do. The tools are the same as what the software engineers use, but the purpose is fundamentally different, and that changes how you use them.

Thank you all for reading, and thank you especially for your support. This is genuinely a labor of love, done in the hope of raising awareness about the value of Claude Code by documenting my own process. If you find it useful, please consider becoming a paying subscriber: it’s at the low, low price of $5/month, $50/year, or at the founder’s price of $250. Thank you!

Yesterday I introduced Referee 2, a persona protocol where you open a fresh Claude Code terminal and paste in a set of instructions that turn Claude into an adversarial reviewer of your own work. Today I want to introduce something related but different: skills.

In the software engineering world, skills are a relatively new Claude Code feature, and you’ll find no shortage of breathless Medium posts about them. Boris Cherny, the creator of Claude Code, uses slash commands and custom workflows extensively: he runs 5 Claudes in parallel, uses Opus 4.5 exclusively, and has built out a system where team knowledge gets encoded into files that Claude reads automatically. The broader Claude Code ecosystem has an official skills repository from Anthropic, community collections, and a growing body of documentation about how to build and share them.

If you read that documentation (and I encourage you to try), you’ll encounter phrases like “YAML frontmatter,” “allowed-tools declarations,” “subagent invocation patterns,” and “slash-command discovery via directory convention.” Which is fine if you’re a developer. But if you’re an applied quantitative social scientist like me, trying to figure out what any of this practically means for your actual work can feel like reading a foreign language.

So let me translate.

A skill is a recipe. You type a short command, like /split-pdf, from the command line while you’re in Claude Code, and Claude follows a detailed set of pre-written instructions to carry out a complex, multi-step task. That’s it. Instead of explaining every step yourself each time (“find this paper, download it, split it into pieces, read each piece, write notes…”), you type one command and Claude Code does the rest.

The instructions live in a file tucked safely away (.claude/skills/split-pdf/SKILL.md) that Claude reads when you invoke the command. You don’t have to understand the file. You just have to type the command.

A natural question follows. I introduced Referee 2 yesterday as a persona, which is a term I think I more or less invented in this context. My Referee 2 persona is just a markdown file that you paste into your subdirectory and have a newly spawned Claude Code read from a fresh terminal. Why not make it a skill?

The answer is about separation. A skill runs inside your current Claude session. Referee 2 must run in a separate session. That’s the whole point: if you ask the same Claude that wrote your code to review it, you’re asking a student to grade their own exam. Referee 2 requires a fresh terminal with zero prior context, no prior commitments, and no memory of the choices it made while writing your code. It has to be independent.

A skill, by contrast, is something you want this Claude (the one you’re currently working with) to do for you right now. Download a paper and take careful notes. Generate a deck from your results. Run a specific cleaning pipeline. These are tasks, not adversarial reviews, and as such, they don’t need new personas to perform them.

So: skills are for tasks you want automated within a session. Personas are for roles you want Claude to adopt in a separate session. Different tools for different jobs. You wouldn’t use a skill for Referee 2 any more than you’d open a fresh terminal just to download a PDF or change to a different directory.

You need a skill when you find yourself explaining the same complex workflow to Claude over and over. If you’ve typed the same 15-line prompt three times this week, that’s a skill waiting to be written.

You don’t need a skill for simple things. “Read this file” is not a skill. “Make me a figure” is not a skill. If the task fits in one sentence and Claude can do it without elaborate instructions, you don’t make a skill; you just ask Claude Code directly to do the thing you want done.

The sweet spot for skills is workflows that are:

  1. Multi-step: several things have to happen in a specific order

  2. Repeatable: you do this regularly, not just once

  3. Fragile: if Claude misses a step or does them out of order, things break

Which brings me to the skill I built this morning.

I’ve routinely run into a problem when asking Claude Code to read and summarize academic papers and other large PDF documents. Actually, two problems.

Problem 1: The session crash. Although Claude Code has a large context window for chatting, it routinely “chokes” on large PDFs. PDFs are token-expensive in ways that simple chatting with Claude is not. PDFs use fonts, vector graphics, tables, and math notation, all of which must get converted into tokens, and a 40-page paper can blow right past the context limit. In fact, a 40-page paper is, strangely enough, longer as a PDF, in terms of tokens, than if you had written it out by hand directly in the prompt! And when Claude Code has reached its limit on reading that PDF, you will get this deadly message: “Prompt too long.”

When Claude Code tells me that, it’s his last dying breath. That particular Claude chat window cannot be revived (it will repeat that message in response to everything you say afterward, or at least it does under the current version of Claude Code that I have) and must then be closed down and a new chat opened in the same working directory. Which is fine; seemingly no harm, no foul, right? I mean, it’s not like you lost the work that you’ve been doing. It’s still in there. All the programs you wrote, all the figures, all the tables are still there even when Claude Code gasps and dies.

Except for one thing: whatever you had been doing in that chat window and had not written down in a progress-log markdown file dies with Claude Code.

Remember: Claude doesn’t live in between the lines. It has no permanent memory even though it speaks like it does. It is a bit strange to have detailed conversations with Claude about a project, watch it die from choking on a large PDF, reopen the entire project, and find that the voice of Claude Code is the same and yet it has no memory whatsoever of anything you had just done. Claude Code selectively suffers from amnesia. It doesn’t remember the work you’ve been doing together unless you’ve been aggressively keeping updated progress logs as markdown files, in which case it can read those and recover the history, but that’s about it. Within a live session it can always re-read the context you have built up, which is exactly why a “prompt too long” error is so costly: it destroys the session and all of that context with it.

Problem 2: The shallow read. Even when the PDF fits, Claude’s attention degrades over long documents. It reads the abstract carefully, skims the methodology, and often hallucinates details from the results. You get a confident summary that is subtly wrong.

So I had a workaround: splitting the PDF manually, avoiding the large original PDF, and reading only a few “splits” at a time. But until this morning I had not made it an official skill. Now I have. It’s called /split-pdf, and you can find everything about it here:

You give Claude a paper, either a local PDF file or a search query like “Gentzkow Shapiro 2014 competition newspapers,” and it does the rest. It searches the web to find the article and downloads it to a local articles/ directory (or uses your local file if you already have it). Critically, it never deletes the original PDF, because that is how I’ve written the skill. The original stays.

Then it splits the PDF using PyPDF2 into 3-to-4-page chunks and stores these splits in a subdirectory named after the article. Then it reads these chunks in small batches (3 splits at a time, roughly 12 pages), pausing between batches so you can review the intermediate output.
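To make the splitting and batching steps concrete, here is a minimal sketch, assuming a recent PyPDF2 (the PdfReader / PdfWriter interface). The function names chunk_ranges, split_pdf, and batches, the 4-page default, and the output file naming are all illustrative assumptions of mine, not the contents of the actual skill file.

```python
from pathlib import Path


def chunk_ranges(n_pages: int, chunk_size: int = 4) -> list[tuple[int, int]]:
    """Return half-open (start, stop) page ranges that cover n_pages."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [(start, min(start + chunk_size, n_pages))
            for start in range(0, n_pages, chunk_size)]


def batches(items: list, batch_size: int = 3) -> list[list]:
    """Group splits into reading rounds of at most batch_size splits."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def split_pdf(pdf_path: str, chunk_size: int = 4) -> list[Path]:
    """Write each chunk to a subdirectory named after the PDF.

    The original file is only read, never modified or deleted.
    """
    # Imported here so the page arithmetic above works without PyPDF2 installed.
    from PyPDF2 import PdfReader, PdfWriter

    src = Path(pdf_path)
    out_dir = src.parent / src.stem  # hypothetical layout: articles/<paper>/
    out_dir.mkdir(parents=True, exist_ok=True)

    reader = PdfReader(str(src))
    outputs = []
    for start, stop in chunk_ranges(len(reader.pages), chunk_size):
        writer = PdfWriter()
        for i in range(start, stop):
            writer.add_page(reader.pages[i])
        # Page numbers in the file name are 1-based for human readers.
        out_path = out_dir / f"{src.stem}_p{start + 1:03d}-{stop:03d}.pdf"
        with open(out_path, "wb") as fh:
            writer.write(fh)
        outputs.append(out_path)
    return outputs
```

Under these assumptions, a 42-page paper gives 11 splits (ten of 4 pages and one of 2), which batches groups into 4 reading rounds of at most 12 pages each.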

For example, I had Claude use this skill on Gentzkow, Shapiro, and Sinkinson’s “Competition and Ideological Diversity: Historical Evidence from US Newspapers” from the American Economic Review (2014). This is a structural IO paper that builds a model of newspaper entry, political affiliation choice, and advertising in two-sided markets to study how competition affects ideological diversity, using the 1924 US daily newspaper market as a laboratory.

Each time Claude reads a batch of splits, it has to perform a targeted extraction across 8 specific dimensions:

  1. Research question: What is the paper asking and why does it matter?

  2. Audience: Which sub-community of researchers cares about this?

  3. Method: How do they answer the question? What is the identification strategy?

  4. Data: What data do they use? Where did they find it? Unit of observation? Sample size? Time period?

  5. Statistical methods: What econometric or statistical techniques? Key specifications?

  6. Findings: Main results? Coefficient estimates and standard errors?

  7. Contributions: What do we learn that we didn’t know before?

  8. Replication feasibility: Is the data publicly available? Replication archive? URLs?

Each time it reads a few splits, it updates a running notes.md file with whatever new information it found. By the end, you have a structured extraction across all 8 dimensions: not a paragraph of vague summary, but specific coefficient estimates, equation numbers, exact data sources and where they were obtained, sample sizes, and a detailed assessment of whether you could replicate the work.
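One way to picture the running notes file is the sketch below. It is only an illustration: the section layout, the function names format_batch_notes and append_batch_notes, and the bullet format are my assumptions, not the format the skill actually writes. What it shows is each batch adding its findings under the eight dimensions without touching what earlier batches wrote.

```python
from pathlib import Path

# The eight extraction dimensions, in the order listed above.
DIMENSIONS = [
    "Research question",
    "Audience",
    "Method",
    "Data",
    "Statistical methods",
    "Findings",
    "Contributions",
    "Replication feasibility",
]


def format_batch_notes(batch_label: str, findings: dict[str, list[str]]) -> str:
    """Render one batch's findings as a markdown section (hypothetical layout)."""
    unknown = set(findings) - set(DIMENSIONS)
    if unknown:
        raise KeyError(f"unknown dimension(s): {sorted(unknown)}")
    lines = [f"## {batch_label}", ""]
    for dim in DIMENSIONS:  # keep a stable order across batches
        for item in findings.get(dim, []):
            lines.append(f"- **{dim}:** {item}")
    lines.append("")
    return "\n".join(lines)


def append_batch_notes(notes_path: str, batch_label: str,
                       findings: dict[str, list[str]]) -> Path:
    """Append a batch section to notes.md; earlier sections are left untouched."""
    path = Path(notes_path)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(format_batch_notes(batch_label, findings) + "\n")
    return path
```

Because the file is opened in append mode, notes from pages 1 to 12 survive intact when the batch covering pages 13 to 24 is written, so the file stays usable even if a later batch fails.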

You can see the actual notes it produced for the Gentzkow paper here. They run to about 320 lines because the paper is methodologically dense. A simpler empirical paper would produce shorter notes.

The second reason I do this, beyond avoiding the session crash, is my belief that shorter engagements with “digital objects” reduce the decay in quality that many kinds of language processing suffer over long inputs. Even with transformers, my belief has been that the longer the task, the more likely hallucinations become. Hallucination errors can still occur, but because hallucinations are usually probabilistic guesses, my hope is that across repeated, separate extractions those errors become far less correlated.

So by giving Claude multiple chances to extract information (reading 12 pages at a time instead of 42 pages at once), the hope is that in total I get a fuller, more accurate description of the article. The errors from one batch don’t compound into the next because each batch is a fresh engagement with a manageable amount of text.

Usually I’m trying to get a clear sense of the data and where to find it, but since I’m already doing it, I might as well extract more subtle details from the article, like its precise place in the literature, or exactly which table has the main specification, or whether the replication archive actually exists.

This is not a substitute per se for reading the paper. But it is a way to keep careful, structured notes about papers you have read. I’ve found that sometimes authors borderline bury key information in footnotes and appendices and I simply can’t find the answer to the question I’m asking. That happened to me recently when I tried to retrieve the data used in someone’s paper, only to realize, after finding it in a footnote, that I had been looking at the wrong dataset all along (the wrong month of the CPS, specifically), which was only made apparent to me by something like footnote 19 or wherever it had been.

So now I’m trying to get all of this surfaced (the data sources, the exact variable definitions, the exact sample restrictions, the replication feasibility) into one structured document that I can come back to.

The output from all of this is a simple markdown document that Claude has been writing to the whole time: notes.md, sitting in the split subdirectory alongside the split PDFs. It’s a permanent artifact. You can come back to it months later. You can share it with a coauthor. You can use it to write your literature review. You can even use it to help create a beautiful deck for you and your coauthors to review later on your Zoom call.

All of this is at the MixtapeTools repo. As I get more formalized skills, I’ll put them there. You need only clone the repo and pull it in when you need it. But try it out. And try the rhetoric of decks prompt I mentioned yesterday: consider making decks explaining what you found in these papers so that you and your coauthors can scrutinize that information yourselves.

Note, though: split-pdf is not a lit review. It’s more like an accessible note-taking process that can produce knowledge from the material fast, and often in a format that you can then engage with permanently, in the event you simply can’t find your notes anywhere, or you need to remember exactly what was in Table 3 of a paper you read six months ago. Because it works in chunks, it also helps you verify precisely where each statistic came from, where each statement was made, where each equation was listed. It’s like Google’s old PageRank in that sense, even though its original purpose was simply to stop Claude Code from choking to death on a bone.

Take everything with a grain of salt. These are workflows that work for me. Your mileage may vary. Thanks again, everyone! And remember, while I can’t compensate you for supporting the substack, I can do this:
