TITAA #66: What You Hide Doesn't Disappear

PDF Processing - Deepwiki - BookWorld - Game AI - Narrative! - 3D Worlds

Lynn Cherny

May 01, 2025

*Art by Victor Hugo, from the Royal Academy exhibit (I’m in the process of writing it up for my blog)*

Straight to the links so I can get outside! These will be links on the web view.

TOC:

AI Creativity (briefly, mostly 3D)
Web Misc / Procgen / Art / Fun
Games Links (and a lot of Games AI Links)
Narrative (quite meaty!)
Data Science / NLP / Tools (a lot; classification, small models, etc)

AI Creativity

I’m declaring holiday bankruptcy on some of the stuff I’d usually cover: there is a new alpha Midjourney v7 I haven’t tried yet, for instance. There were a lot of new video model releases, which I haven’t had time to try yet either (Kling 2! More Wan! etc). But here are a few things:

Audio

I don’t keep a close eye on Audio in this newsletter as you know, but this:

The heroic upstart nari-labs/dia, “a TTS model capable of generating ultra-realistic dialogue in one pass,” got a lot of press. It can do dialogue, laughter, etc.

3D / Generation

NASA Mars Editor via the Epic Store: “The NASA XOSS UE5 Editor provides you the basic tools to begin building your own multi-user EVA scenarios on the Martian surface for the NASA MarsXR Challenge.” It won’t work on my Mac but I have this game PC lying around… This is a few years old but still cool (h/t Bilawal Sidhu).

The Google Earth 3D map tiles are officially usable in Three.js now. It’s really weird watching them render as you zoom in on a scene, they kind of… melt.

Ito Meikyu Is A Digital VR Labyrinth Inspired By Japanese Art History: this looks amazing. Made by Boris Labbé, inspired by The Fukinuki Yatai, The Tale of Genji, and The Pillow Book.

A sudden burst of new 3D gen tools; damn, it’s fast-moving:

Sketch to 3d - goact.fun demo. Also supports image input.
Meta Horizon Desktop Editor Can Now AI Generate 3D Assets.
Tencent Hunyuan 3D new one
CommonSenseMachine’s Blender-MCP tooling, also with Mixamo.
Krea AI has a 3d stage experimental feature that tries to make you a whole scene from a prompt, with objects in it, that you can export to Blender. Yes, the Blender export works (see pic below), but the actual scene breakdown and objects are all over the place, because it doesn’t have a great world model behind it yet. I asked for a church crypt with romanesque pillars: I got a bunch of different pillar types, a bunch of different unconnected walls with different textures on them, etc. These could’ve been all copies of the same pillar and wall.

Unirig - Diverse Skeleton Rigging model, with code.

Web Misc / Procgen / Art / Fun

Watch Global & Local Live TV Online for Free. This is pretty useful for folks wanting to practice their language understanding. I have just switched on a French soap on TF1 in the background.

WORD SCULPTURE — a video of strange word mutations and wordplay from the designer of Blue Prince - via mike cook.

Nolen’s One Million Chessboards: “I made a website. It's called One Million Chessboards. It has a million chessboards on it. Moving a piece moves it for everyone, instantly. There are no turns.” I can’t even visually parse this.

Neural Network in a Loop (h/t Luokai): “I turned a forest trail near my apartment into a playable neural world. You can explore that world in your web browser. By ‘neural world’, I mean that the entire thing is a neural network generating new images based on previous images + controls. There is no level geometry, no code for lighting or shadows, no scripted animation. Just a neural net in a loop.” This is a pretty, pixelated world in the browser. I found it oddly moving, like seeing memories.
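For anyone wondering what “a neural net in a loop” looks like structurally, here is a minimal sketch of the idea: each new frame is produced from the previous frame plus a control input, and then fed back in. Everything in it is an assumption for illustration. `toy_world_model` is a hypothetical stand-in (it just shifts pixels and adds noise) for the trained network, and a 4-pixel list stands in for an image; none of this is the project's actual code.

```python
import random


def toy_world_model(prev_frame, control, rng):
    """Hypothetical stand-in for the trained network.

    Maps (previous frame, control) to the next frame. Here it only
    shifts pixels sideways by `control` and adds a little noise.
    """
    width = len(prev_frame)
    shifted = [prev_frame[(i - control) % width] for i in range(width)]
    # Clamp to [0, 1] so the values stay valid pixel intensities.
    return [min(1.0, max(0.0, p + rng.uniform(-0.01, 0.01))) for p in shifted]


def rollout(first_frame, controls, model=toy_world_model, seed=0):
    """Run the loop: every output frame becomes the next input frame."""
    rng = random.Random(seed)
    frames = [first_frame]
    for control in controls:
        frames.append(model(frames[-1], control, rng))
    return frames


# One bright pixel, moved right, right, then left by the "player".
frames = rollout([0.0, 0.0, 1.0, 0.0], controls=[1, 1, -1])
brightest = [frame.index(max(frame)) for frame in frames]
```

The point of the sketch is only the feedback: there is no stored level, so whatever the model drew last is the entire state of the world.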

Games

Thousand Lives, an email based interactive fiction experience. (Via Jon Ingold.)

📖 Ultima and Worldbuilding in the Computer Role-Playing Game - an open access book (via the new Bathysphere game-related email newsletter, who are opening paid subs now!).

📖 Evidently the well-reviewed new French-origin Clair Obscur RPG is inspired by a book that Younès did a talk about. Younès Rabii: "La Horde du Contrevent": A Novel That Didn't Know It Was A Roguelike. Totally fascinating, I really need to try reading it in French.

📖 The Video Game History Foundation is slowly putting up Very Very Old PDF issues of Game Dev Conference proceedings (h/t Pippa Brooks). They are massive tomes of 1000+ pages. It’s a fascinating browse.

💽 An Internet Archive data game jam focused on old CD ROMs, Discmaster Jam, led to this Tarot-inspired rumination game Archaeos by Florence Smith Nicholls and Mike Cook. I really like the idea of jotting down thoughts about a collection artifact.

🔧 Animation library for use in games, Quaternius Universal Animation Lib. And don’t forget the wonderful Kenney, who keeps turning out asset packs.

Game AI Stuff

TextArena - open source text games for LLM work.

TextArena is an open-source collection of competitive text-based games for training and evaluation of agentic behavior in Large Language Models (LLMs). It spans 57+ unique environments (including single-player, two-player, and multi-player setups) and allows for easy evaluation of model capabilities via an online-play system (against humans and other submitted models) with real-time TrueSkill scores.

Video Game Bench. Huh, actually really interesting — look at the latest Big models with vision and reasoning trying to play games. It’s not looking great (forget about just Pokemon challenges).

🧙‍♀️ One Spell Fits All: A Generative AI Game as a Tool for Research in AI Creativity and Sustainable Design — using small models. This looks cute. “… AI-native game prototype where the player, playing as a witch, solves villagers’ problems using magical conjurations. We show how, beyond being a standalone game, "One Spell Fits All" could serve as a research platform to explore several key areas in AI-driven and AI-native game design.”

“This AI Is Learning To Create INTERESTING Games” - Game Innovation video via Julian Togelius, about the paper “GAVEL: Generating Games Via Evolution and Language Models.” What is interestingness?

Collaborating Action by Action - This is the project/code for Mindcraft. Maybe I missed their excellent failure states when I first saw it? Worth a look. The agents chatting are really funny, like their hunt for spiders here:

🔧 VideogameMCP - An effort to create MCP coding tools and templates for game coding with AI.

Narrative

A fab set of slides (109!) from Maria Antoniak - Computational Approaches to Narrative. A reminder that narrative spans a lot of definitions and topics. And a look at how to build a story detector.

Finding Flawed Fictions: Evaluating Complex Reasoning in Language Models via Plot Hole Detection. Useful and interesting!

We introduce FlawedFictionsMaker, a novel algorithm to controllably and carefully synthesize plot holes in human-written stories. Using this algorithm, we construct a benchmark to evaluate LLMs' plot hole detection abilities in stories -- FlawedFictions -- , which is robust to contamination, with human filtering ensuring high quality. We find that state-of-the-art LLMs struggle in accurately solving FlawedFictions regardless of the reasoning effort allowed, with performance significantly degrading as story length increases. Finally, we show that LLM-based story summarization and story generation are prone to introducing plot holes, with more than 50% and 100% increases in plot hole detection rates with respect to human-written originals.

⭐️ BookWorld: From Novels to Interactive Agent Societies for Creative Story Generation / Project page with forthcoming demos and code.

We introduce BookWorld, a comprehensive system for constructing and simulating book-based multi-agent societies. BookWorld's design covers comprehensive real-world intricacies, including diverse and dynamic characters, fictional worldviews, geographical constraints and changes, e.t.c. BookWorld enables diverse applications including story generation, interactive games and social simulation, offering novel ways to extend and explore beloved fictional works. Through extensive experiments, we demonstrate that BookWorld generates creative, high-quality stories while maintaining fidelity to the source books.

A very interesting project with code to try… I know a few startups working on this problem! Their list of books tested is diverse, from Solaris to Paradise Lost. Props.

Drama Llama: An LLM-Powered Storylets Framework for Authorable Responsiveness in Interactive Narrative - A Max Kreminski thing I had missed. Storylets means you, game writers. See Midjourney Storytelling Lab page (h/t Lex Fefegha).
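If storylets are new to you: a storylet is a chunk of story content gated by a precondition on the world state, and the engine repeatedly picks from whichever ones are currently eligible. Below is a minimal sketch of that classic pattern with hand-written boolean preconditions. The names (`Storylet`, `available`, `play`) and the sample content are invented for illustration, and this is not Drama Llama's implementation; the `precondition` check is simply the spot where an LLM-evaluated trigger could be swapped in.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Storylet:
    name: str
    text: str
    # Returns True when this storylet is eligible in the given state.
    precondition: Callable[[dict], bool]
    # State changes applied after the storylet plays.
    effects: dict = field(default_factory=dict)


def available(storylets, state):
    """All storylets whose preconditions hold in the current state."""
    return [s for s in storylets if s.precondition(state)]


def play(storylet, state):
    """Return the storylet's text and a new state with its effects applied."""
    return storylet.text, {**state, **storylet.effects}


STORYLETS = [
    Storylet(
        "meet_ghost",
        "A ghost sits on your desk.",
        lambda s: not s.get("met_ghost"),
        {"met_ghost": True},
    ),
    Storylet(
        "ghost_speaks",
        "The ghost finally speaks.",
        lambda s: s.get("met_ghost", False) and not s.get("ghost_spoke"),
        {"ghost_spoke": True},
    ),
]

state = {}
log = []
# Keep playing the first eligible storylet until none are left.
while options := available(STORYLETS, state):
    text, state = play(options[0], state)
    log.append(text)
```

A real system would choose among eligible storylets by priority or player choice rather than always taking the first one.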

Object-Driven Narrative in AR: A Scenario-Metaphor Framework with VLM Integration — this has some strange things in it (this is almost always good). Like, this creepy example — do I need my apartment to be scarier?

Narrative Studio: Visual narrative exploration using LLMs and Monte Carlo Tree Search. With code.

Another text creativity study — with knobs. Beyond Memorization: Mapping the Originality-Quality Frontier of Language Models: “We evaluate the novelty of generations from two families of open-data models (OLMo and Pythia) on three creative tasks: story completion, poetry writing, and creative tool use. We find that LLM generated text is less novel than human written text. To elicit more novel outputs, we experiment with various inference-time methods, which reveals a trade-off between originality and quality.”
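One simple way to put a number on “less novel” is the share of a text's n-grams that never appear in a reference corpus. The sketch below is an illustrative proxy only, under my own assumptions (whitespace tokenization, trigrams, a two-sentence toy corpus); it is not the paper's metric, and it says nothing about quality, which is the other axis of the trade-off.

```python
def ngrams(tokens, n):
    """Set of all contiguous n-token tuples in a token list."""
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def ngram_novelty(generated, reference_corpus, n=3):
    """Fraction of the generated text's n-grams absent from the corpus.

    1.0 means every n-gram is unseen; 0.0 means everything was copied.
    Tokenization is a naive lowercase whitespace split.
    """
    gen = ngrams(generated.lower().split(), n)
    if not gen:
        return 0.0  # text shorter than n tokens: nothing to score
    seen = set()
    for doc in reference_corpus:
        seen |= ngrams(doc.lower().split(), n)
    return len(gen - seen) / len(gen)


# Toy stand-in for training data.
corpus = ["the ghost sat on my desk", "the wind moved in the garden"]

copied = ngram_novelty("the ghost sat on my desk", corpus)
mixed = ngram_novelty("the ghost sat in the garden", corpus)
fresh = ngram_novelty("a lantern hummed beneath the river", corpus)
```

Here `copied` scores lowest and `fresh` highest, with `mixed` in between because it stitches two seen phrases together with unseen joins.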

Synthetic SimpleStories generated dataset.

Data Science / NLP / Tools

Deepwiki: this might win the award of most useful AI tool for a few months. Use AI on public repos on GitHub to get an overview and ask questions about them, for free, via the Devin folks. Extremely useful. Try a big one like three.js.

Learning to Attribute with Attention — includes code and a demo showing which citations are actually used in a search/summary report. You highlight the text in the result, and get a sidebar with what was used. In some of mine, only certain articles are used over and over, which causes me some concern about bias in responses.

Small models and citations, too…

Even Small Reasoners Should Quote Their Sources: Introducing the Pleias-RAG Model Family — We introduce a new generation of small reasoning models for RAG, search, and source summarization. And their lib: GitHub - Pleias/Pleias-RAG-Library: Python library to use Pleias-RAG models.

And in other small model news, the Unsloth optimized, easy to run locally Phi4 collection. And a tutorial on fine-tuning Gemma with Firecrawl help.

Document processing/OCR stuff:

GitHub - VikParuchuri/surya at foundation — a new release version of the multilingual OCR doc processing model is in the works (an alpha).
GitHub - ucbepic/docetl: A system for agentic LLM-powered data processing and ETL. Includes the ability to iterate on your pipeline commands.
Build a Multimodal RAG with Gemma 3, LangChain and Streamlit - YouTube — tutorial.
spaCy’s document processing: Ines Montani has some useful slides on how to do PDF pipelines using spaCy and Prodigy.

There’s been a lot on multimodal RAG recently, feel free to drop me a note if you want more.

Classification:

Classifier Factory | Mistral AI Large Language Models : “We designed a friendly and easy way to make your own classifiers. Leveraging our small but highly efficient models and training methods, the Classifier Factory is both available directly in la plateforme and our API.”
Efficient Inference for ModernBERT Classifiers Using vLLM – Daniel van Strien

Introducing the Search Arena: Evaluating Search-Enabled AI | LM Arena. “Based on 7k human votes (03/18–04/13), Gemini-2.5-Pro-Grounding and Perplexity-Sonar-Reasoning-Pro are at the top, followed by the rest of Perplexity’s Sonar models, Gemini-2.0-Flash-Grounding, and OpenAI’s web search API models.” Hopefully LM Arena can get back its credibility after the recent gaming-the-leaderboard Llama drama.

System Prompts: A couple of repos claim to have the system prompts for all the big models, one being here, another here and there are probably more.

Another MCP solution to getting up-to-date docs for tools into your prompts in Cursor or any other AI tool, Context 7 MCP. (No, I haven’t tried it. I am using Firecrawl.)

Wikipedia structured data on Kaggle, so you don’t have to scrape for it anymore: Wikipedia Structured Contents.

A Poem


Never to see ghosts? Then to be
haunted by what is, only—to believe that glass
is for looking through, that rooms, too, can be empty,
the past past, deeds done,
that sleep, however troubled, is your own?
Do the dead lie down, then? Are blind men blind?
Is love in touch alone? Do lights go out?
And what is that shifting, shifting in the mind?
     The wind, the wind?

No, they are there. Let your ear be gentle,
At dawn or owl cry, over doorway or lintel,
theirs are the voices moving night toward morning,
the garden’s grief, the river’s warning.
Their curious presence in a kiss,
the past quivering in what is,
our words odd-sounding, not our own—
how can we think we sleep alone?

What do they have to tell? If we can listen,
their voices are denials of all dying,
faint, on a long bell tone, lying
beyond sound or belief, in the oblique
last reach of the sense through layers of recognition. . . .
Ghost on my desk, speak, speak.

—Alastair Reid

Happy May Day/Beltane, —Lynn (@arnicas on mostly bluesky, ex twitter, mastodon).

Johannes Lackner

May 1

Hi Lynn, thank you for (once more) a great collection of links and topics. BookWorld sounds intriguing (and like sth that was bound to happen). You say you know a few startups working on the same problem - are you free to tell us (a bit) more? And yes, more Multimodal RAG news would be very welcome! Have a splendid 1st of May


Things I Think Are Awesome
