TITAA #38: Dark Feet and Dark Wings
Forest Monsters - ChatGPT - Blobs - Coding LLMs - Oxford Words - Widgets
I bought myself a copy of Charles Fréger’s book Wilder Mann, photos of pagan European costumes. This one above shows Krampus monsters from Bad Mitterndorf, Austria. Krampus is the bad side of Saint Nick: he punishes and scares bad children. There are whips involved. For lots of other fab monsters, do see this page of Fréger’s, which has a load of them.
The “12 Days of Christmas,” from December 24 to January 5, and often Advent itself, are associated with spirits and monsters. Yule can be connected with the folklore of the Wild Hunt, its hunters the souls of the dead, ghostly dogs, or fairies. “These days aren’t part of ordinary life,” says Fréger’s book. We need to protect ourselves against the dark.
Speaking of which, I read some creepy books about monsters in the woods this month, and the BBC podcast treatment of Susan Cooper’s “The Dark is Rising” is on point and excellent. See Books below!
My annual round-up of best books, games, and TV of the year was sent out the week before Christmas to paid subscribers only. I wanted to keep the length on this one shorter by doing that separately. Feel free to contribute if you like my recs and want to support my labor: the $50/year rate works out to a coffee a month. I drink a lot of coffee. Now, onto the linkfest and my takes on the latest tools!
I guess it’s ChatGPT month! And, in general, OpenAI’s GPT-3.5. I’ve dabbled with it for some programming help, tried various creative exercises, and used it on one client project, and my take is this:
Undeniably, the longer prompt context is a huge win, and the results are better for many instructions you give it. (But the API often errors due to overload right now, which is annoying for paying customers.)
The chat interface and context-aware sessions seem to have helped with virality, over the usual playground UI (smart move).
For programming help, it’s sometimes useful, but longwinded and didactic. OTOH, you can cut and paste code via their widgets, which is useful. So, if you want tutorial-style, verbose help, it’s not bad? But otherwise, it’s easier to use Copilot (as a VSCode plugin) for in-context help, although that doesn’t do as good a job at understanding your big picture goal. See more below on Copilot in the DS links section.
The poetry it produces is poor. It can rhyme now (and in fact will even when you ask for free verse), but it’s trite: lots of “the future will be better” stuff. It doesn’t handle poetry styles terribly well.
The creative prose is also a bit mediocre, although it does better at some author styles than the poetry. For very short stories that are reasonably well-formed, it performs pretty well. I’d like to be working in a heavy editing context with models like this, and I think this is now feasible.
It can do fun AI Dungeon-style turn-based text games (e.g., prompts here and here), and I love the hacks people have come up with to add images to their sessions using markdown and Unsplash. But Unsplash is a pretty weak image provider for this content. Here it is trying to illustrate a creepy woods scene with characters around a fire (& notice how generic the text is, despite being told to be Lovecraft):
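For the curious, one common variant of the image hack looks roughly like the prompt fragment below: you instruct the model to emit markdown image links against Unsplash’s keyword endpoint, and the chat UI renders them inline. The exact wording varies by prompt author, and `<keywords>` is a placeholder the model fills in:

```markdown
Whenever you describe a new scene, also output an image on its own line in
this exact format, replacing <keywords> with 2-3 comma-separated words
describing the scene:

![scene](https://source.unsplash.com/1600x900/?<keywords>)
```

Since Unsplash just does keyword search over stock photos, the results are only ever loosely matched to the scene, which is the weakness noted above.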
Here’s an explainer from Hugging Face on RLHF, “reinforcement learning from human feedback,” the method used for ChatGPT training.
For all the hot takes proclaiming this is “the future of search” — well, people use search engines for timely information (movie listings, sports scores), directions, products, local info, actual links to things (tutorials, articles)… There is a research and design literature on “information foraging” and search behavior; a single chatbot isn’t going to replace all of this, IMO. A conversational chatbot is even going to be annoying for some queries. I think Gary Marcus’s recent post is a great summary of some issues here. Lots of search problems still want a recent and regularly updated database.
Meanwhile, the AI-writing startups keep happening; here’s Orchard. I liked this piece: “Co-writing Fiction with Generative AI, with Charlene Putney.” This has many good nuances. I see generative AI tools as creative aids, not replacements, in both the image and text realms. I hope we all become excellent editors of machine first drafts. (ETA: This article in Wired on working with AI writing tools is absolutely my vibe here, I forgot to include it before.)
Karlo, trained with the unCLIP method like DALLE-2, is pretty good at following image composition directions. Model release and demo linked here.
The Auto-Photoshop-StableDiffusion-Plugin, to use AUTOMATIC1111’s SD tool with Photoshop directly (by Abdullah Alfaraj).
Custom Diffusion from Adobe is supposed to be much faster for tuning diffusion models, and the resulting models are smaller. It trains “given a few images of a new concept (~4-20). Our method is fast (~6 minutes on 2 A100 GPUs) as it fine-tunes only a subset of model parameters, namely key and value projection matrices, in the cross-attention layers. This also reduces the extra storage for each additional concept to 75MB.”
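That key/value-only recipe is easy to picture as a name filter over model parameters. Here is a minimal illustrative sketch using made-up, diffusers-style parameter names (nothing here touches a real UNet; the names and the `attn2` convention for cross-attention are assumptions for illustration):

```python
# Sketch of Custom Diffusion's parameter selection: given hypothetical UNet
# parameter names, mark only the key/value projection matrices in
# cross-attention ("attn2") layers as trainable; everything else stays frozen.
param_names = [
    "down_blocks.0.attn1.to_q.weight",    # self-attention: frozen
    "down_blocks.0.attn2.to_q.weight",    # cross-attn query: frozen
    "down_blocks.0.attn2.to_k.weight",    # cross-attn key: trained
    "down_blocks.0.attn2.to_v.weight",    # cross-attn value: trained
    "mid_block.attn2.to_k.weight",        # cross-attn key: trained
    "up_blocks.1.resnets.0.conv1.weight", # conv layer: frozen
]

def is_trainable(name: str) -> bool:
    """Train only K/V projections inside cross-attention layers."""
    return "attn2" in name and (".to_k." in name or ".to_v." in name)

trainable = [n for n in param_names if is_trainable(n)]
print(trainable)
```

Because only that small subset of matrices changes, the saved “concept” is just those matrices, which is where the 75MB figure comes from.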
Using the new depth2image in Stable Diffusion 2 — a thread from Justin Alvey (and KarenXCheng) on using it to generate different room designs from block objects arranged like a bedroom.
A new release of Carson Katri’s Blender add-on for Stable Diffusion, Dream Textures. This has a lot of additions, including using the depth model. Meanwhile, here is a site to generate textures (poly.cam; they do lidar and photogrammetry tools).
I was surprised by this ability of Midjourney v4 to generate a palette for you (sometimes)! I have not checked if the colors are accurate to the images—certainly some are missing below. We can ignore the attempt at hexcodes :)
Music generation via Riffusion. This is deeply strange and wonderful in its indirectness: Using generated spectrograms to generate music.
Other Arty and Procgen Stuff
Taper #9, a collection of low-fi small (2KB) text and literary arts pieces via Nick Montfort. My favorite is “Qitty” by Jim Gouldstone. I also like “Blackbox” by Vinicius Marquet and poem-builder game “Lives” by Ala Meyer.
“Particle Lenia and the energy-based formulation” by Alex Mordvintsev et al, with demos.
A beautiful article on the font design by Riley Cran for the medieval game Pentiment (one of my favorite games of the year). The fonts are the voices of the characters in this game, and are cleverly animated.
Concept: A “technique that leverages CLIP and BERTopic-based techniques to perform Concept Modeling on images. … Thus, Concept Modeling takes inspiration from topic modeling techniques to cluster images, find common concepts and model them both visually using images and textually using topic representations.” It uses WordNet to label the clusters. I’d want to compare it to other clusterings of image embeddings to be convinced it’s better, but points for the extension of topic modeling…
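The core idea is easier to see in miniature: if images and words live in one shared embedding space (as CLIP provides), groups of images can be named by nearby words. Here is a toy sketch with hand-made 2-D vectors, which skips the library’s actual clustering step and API entirely and just assigns each image to its nearest concept word:

```python
# Schematic of concept modeling: put image and word embeddings in one space,
# then label images by their nearest word. The vectors are tiny hand-made
# stand-ins, not real CLIP embeddings.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# toy "image embeddings" (2-D for readability)
images = {"img1": (0.9, 0.1), "img2": (0.8, 0.2), "img3": (0.1, 0.95)}
# toy "word embeddings" in the same space
words = {"dog": (1.0, 0.0), "tree": (0.0, 1.0)}

# assign each image to its nearest word-concept
concepts = {}
for name, vec in images.items():
    best_word = max(words, key=lambda w: cosine(vec, words[w]))
    concepts.setdefault(best_word, []).append(name)

print(concepts)  # → {'dog': ['img1', 'img2'], 'tree': ['img3']}
```

The real library clusters the image embeddings first and then finds representative words per cluster, but the shared-space nearest-neighbor trick above is what makes the textual labels possible at all.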
Blender: EasyBPY, by Curtis Holt, an easier way to use the Blender API in Python (this is a layer on top of it).
Zach Lieberman’s Atlas of Blobs - I will not lie, I like this because it has cool text too. “I asked ten artists, designers, researchers, and visual thinkers to pick one of the blob forms I’ve made and write a text to name and describe it.”
Speaking of blobs, Tim Biskup’s videos of drawing blob creatures with a piece of graphite on IG are mesmerizing.
Data Science and NLP
My talk for the machine learning Normconf — slides here, and video (shorter than the slides content), and growing repo of code and references. I covered making an interactive UMAP of text embeddings for various purposes, entity recognition and linking/deduping as a problem (a couple solutions for tools to use), and using rules in a spaCy classifier pipeline to back up your model. Plus some life-saving tips I’ve found useful recently that I wish I’d had time to cover. My tips include:
Use the code search on Github (new, but very helpful for very specific searches).
Specialized search engines like Metaphor, trained on contexts around links, can help you find tutorials, overview articles, research papers…
Light the Torch, a lib to help you install the right versions of torch and torchvision.
Copilot for help programming: Yes, I find it often useful for boilerplate; I do know about the lawsuit and am watching the alternate project BigCode. You can try the new SantaCoder demo (JS, Java, Python) on Hugging Face. Here’s an interesting look at Copilot internals. And this work from Google on a Python LLM sounds great, but I guess isn’t out anywhere?
CHEESE from CarperAI is a Gradio feedback (labeling) tool. This is pitched as an open-source tool aimed at “reinforcement learning from human feedback” (RLHF) datasets, the way InstructGPT and ChatGPT were trained.
This search paper (“Precise Zero-Shot Dense Retrieval without Relevance Labels” by Gao et al., the HyDe system) is very clever and amusing: Use a generative model to create fake documents in response to a query, embed them, and then use the hypothetical document embedding to assist retrieval of real documents.
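The pipeline shape is simple enough to sketch in a few lines. In this toy version, the LLM is a hard-coded stand-in and the dense encoder is a bag-of-words vector over a tiny fixed vocabulary; both are fake, purely to show where the hypothetical document fits:

```python
# Toy sketch of the HyDE idea: instead of embedding the query directly,
# embed a *hypothetical* answer document and retrieve real documents near it.
import math
from collections import Counter

def embed(text):
    """Stand-in dense encoder: a bag-of-words vector over a fixed vocab."""
    vocab = ["krampus", "folklore", "austria", "python", "parquet", "duckdb"]
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(x * x for x in b)) or 1.0
    return dot / (na * nb)

def fake_llm(query):
    """Stand-in for the generative step: writes a plausible (fake) answer."""
    return "krampus is a figure of alpine folklore in austria"

corpus = [
    "krampus folklore traditions of austria",
    "querying parquet files with duckdb",
]

query = "who is krampus?"
hypothetical_doc = fake_llm(query)   # step 1: generate a fake answer document
q_vec = embed(hypothetical_doc)      # step 2: embed the fake document
best = max(corpus, key=lambda d: cosine(q_vec, embed(d)))  # step 3: retrieve
print(best)  # → krampus folklore traditions of austria
```

The clever part is that the generated document can be factually wrong and retrieval still works: it only has to *sound like* a relevant document so that its embedding lands near the real ones.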
“Prompting Is Programming: A Query Language For Large Language Models,” by Beurer-Kellner et al. I am way into this idea of a query language for prompt building even though I initially laughed because this is how we get jobs for prompt engineering. I can’t find a repo yet?
A tutorial by the excellent Phil Schmid on tuning Flan-T5 for dialogue summarization.
EMNLP 2022 (the empirical-methods NLP conference) papers are all in the new Semantic Scholar skimming tool to read. You can filter by topics like “story” for some interesting ones. I’m not convinced by the highlights in this tool, so far. But it got me to sign up for Semantic Scholar as an organization and discovery tool.
Applying ML: A bunch of resources (from Eugene Yan) on ML in practice including code patterns for ML systems, which I think has been expanded. I am personally a bit depressed by the not-new sentiment “be more end-to-end” in which we are supposed to be good at everything, because having too many people on a team means communication bottlenecks and, uh, I guess management. I don’t disagree, but. The code patterns parts are really useful.
Reacton — react-based ipywidgets for Jupyter notebooks. I struggle with widgets, tbh, and I don’t know yet if this is better. But this sounds like a good vision statement:
“Instead of telling ipywidgets what to do, e.g.:
Responding to events.
Attaching and detaching event handlers.
Adding and removing children.
Manage widget lifetimes (creating and destroying).
You tell reacton what you want (which Widgets you want to have), and you let reacton take care of the above.”
Here’s a Medium post by Maarten Breddels about this project.
Meanwhile there is ipysigma, a Jupyter widget for sigma.js network rendering.
“Creating a Dashboard with Gradio for Real-Time Bigquery Data,” a tutorial.
Tesseract.js — a pure JS port of the OCR lib Tesseract, supporting more than 100 languages.
Datasets in Hugging Face are now getting Parquet format, which is very good news for big data tool folks. Here’s an Observable notebook demo using the duckdb client in Observable.
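With a dataset mirrored as Parquet, a duckdb client (with its httpfs extension loaded) can query it straight over HTTP without downloading the whole thing first. A sketch with a placeholder URL — the real per-dataset Parquet paths are listed on each dataset’s page:

```sql
-- Hypothetical example: query a Hugging Face dataset's Parquet file directly.
-- <user>/<dataset> is a placeholder, not a real dataset path.
INSTALL httpfs;
LOAD httpfs;

SELECT label, COUNT(*) AS n
FROM 'https://huggingface.co/datasets/<user>/<dataset>/resolve/main/train.parquet'
GROUP BY label
ORDER BY n DESC;
```

This is the “big data tool folks” win: Parquet is columnar, so engines like duckdb can scan just the columns a query needs.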
Games and Narrative Generation
BOOKSUM, a collection of annotated narrative texts for long-form summarization (Kryściński et al, paper). Addressing the issue that a lot of corpora for summarization are news or otherwise non-fiction. “Our dataset covers source documents from the literature domain, such as novels, plays and stories, and includes highly abstractive, human written summaries on three levels of granularity of increasing difficulty: paragraph-, chapter-, and book-level.” This is an amazing resource for more than just summarization (although I think it’s only Project Gutenberg content, which means style and genre limitations). Er, interestingly, there is also Narrasum, by Zhao et al, aiming at a similar goal.
“Flavor Text-Generation for Role-Playing Video Games,” a recent dissertation by Judith van Stegeren. She looks at procgen and GPT-2-based ways of creating in-game flavor text.
“An AI Dungeon Master’s Guide: Learning to Converse and Guide with Intents and Theory-of-Mind in Dungeons and Dragons,” by Zhou et al. Another RL (reinforcement learning) text modeling approach, focused on inferring goals. “[The] DM: (1) learns to predict how the players will react to its utterances using a dataset of D&D dialogue transcripts; and (2) uses this prediction as a reward function providing feedback on how effective these utterances are at guiding the players towards a goal.” A theory of mind for the players improves the outcome.
“DOC: Improving Long Story Coherence With Detailed Outline Control.” By Kevin Yang et al. This is an improvement over their Re3 system (from an issue a month or two ago), which is actually quite cool. I tried Re3 and was interested in refactoring it to make the modules work separately with more user control. Can’t wait to try this!
One Thousand and One Nights: An AI-collaborative story game where your words become objects in the game and can be used in fights. Seems like something Hidden Door might be into! They have a paper, “Bringing Stories to Life in 1001 Nights: A Co-creative Text Adventure Game Using a Story Generation Model,” by Sun et al. They used a text generation model trained on fanfic behind the scenes.
“Open-world Story Generation with Structured Knowledge Enhancement: A Comprehensive Survey,” by Wang et al: “(i) we present a systematical taxonomy regarding how existing methods integrate structured knowledge into story generation; (ii) we summarize involved story corpora, structured knowledge datasets, and evaluation metrics; (iii) we give multidimensional insights into the challenges of knowledge-enhanced story generation.” It looks fab! What happened to my vacation for paper reading?
“Generating Knowledge Graphs using GPT-J for Automated Story Generation Purposes,” by Dani. A thesis in the Mark Riedl empire of knowledge-based and constrained story generation, this work expands content missing in ConceptNet using GPT-J. I’d like that content, please.
“Towards Inter-Character Relationship-Driven Story Generation,” by Vijjini et al. If two characters hate each other, that changes the story, right?
“Godot RL Agents” is a “fully Open Source package that allows video game creators, AI researchers and hobbyists the opportunity to learn complex behaviors for their Non Player Characters or agents.”
Videos from AdventureX are up here.
The BBC podcast of The Dark is Rising is great. Good 3d audio!
The Dictionary of Lost Words, by Pip Williams (literary fic). A fictionalized but historically based account of working on the Oxford English Dictionary, told from the perspective of a young woman who worries about the words that aren’t going in, and the speakers who use them (poor, illiterate women). Set at the time of the suffrage protests and the edge of WWI. Thought-provoking if often sad, a good read for linguists.
★ Babel, by RF Kuang (fantasy). So much wonderfulness about linguistics and translation, a fab magic system using the tension between words of proximate meaning in two languages, and a deeply emotional look at colonialism and racism and the requirements of revolution. Plus! Alt Oxford, for a setting.
★ If You Could See the Sun, by Ann Liang (YA fantasy). So good! I read it in two sittings. A young scholarship student who is working her ass off to be top of her class at a Beijing private school suddenly discovers she can turn invisible. She gets help from her chief academic rival, Henry, to offer spying services anonymously to her wealthy classmates. The romance is obvious but cute, the writing is sharp; the anger Alice feels over the class and wealth differences she contends with makes this spicy. Plus, those food descriptions!
Patricia Wants to Cuddle by Samantha Allen (fantasy/mystery/horror). A reality show, kind of like The Bachelor, films on an island off the coast of Seattle, where lesbians live and hikers disappear in the woods. There is a lot of amusing interpersonal drama among the contestants and staff, sort of weakened by a fast ending.
The Dark Between The Trees, by Fiona Barnett (fantasy). Creepy Moresby Wood has stories of witches, monsters, and missing people. Academic Alice finally gets permission (after 20 years of being laughed at) to take a group into the woods to look for remains of a historical disappearance. Things go wrong. We get both the historical story and the modern expedition’s. I liked this a lot for the academic desperation and atmosphere.
Invisible Things, by Mat Johnson (sf). An expedition to Jupiter finds a city in a bubble on Europa. This is very good social satire, and the two camps on the spacecraft — the boys’ club asshole “Bobs” vs. the two Black scientists — are really well drawn. The recognizable politics of the city bogged me down a bit, but the ending was pitch perfect.
The Cloisters, by Katy Hays (mystery?). Reasonable mystery about a young art historian getting drawn into a search for a Tarot deck at the Cloisters in NYC. The staff relationships become… strange and obsessive.
★ Pentiment. Fantastic. I loved the local folklore, Roman ruins, old books… I found the long history and mysteries of the townsfolk vs. the abbey surprisingly entrancing. I was genuinely moved by the bits of Andreas’s marriage that we saw. The narrative remains somewhat linear, I think, despite many of your choices (you have a number of mysteries to solve), but I am still debating checking online for details I didn’t uncover in my playthrough.
The Case of the Golden Idol. A pixel-art detective deduction game. This has been my holiday puzzler. You are presented with some frozen pictures with clickable areas holding clues. It helps to take notes.
Dwarf Fortress, the Steam version with new sprite art. I like the art! I’ve also turned into a Twitch-game-watching person as I watch other people play this. It’s just so deep. The response to the release by game buyers and the press has been so gratifying. The WaPo article about it was particularly good, situating it against other generative world games: “‘Dwarf Fortress’ is a storytelling engine as much as it is a game, spitting out associations and facts and details that you can shape into a coherent and specific narrative.” (The Polygon article is good because of its concern for a depressed dwarf.)
VR: Wanderer is a good narrative puzzle game in VR. You are trying to solve a bunch of little puzzles scattered across different times, using tools you find along the way. It’s especially good that there is a little guide who can give you hints; some of the puzzles are opaque enough to really need them. I’m surprised I haven’t heard more about this game — it’s really fun!
Poem - “To Know the Dark” (Wendell Berry)
To go in the dark with a light is to know the light.
To know the dark, go dark. Go without sight,
and find that the dark, too, blooms and sings,
and is traveled by dark feet and dark wings.
I hope your holidays contained some non-work periods, and many books and movies and games. May your 2023 be better and more creative! Mine really needs to be. Thanks again for your support in reading and sharing this newsletter.
Best, Lynn / @arnicas