TITAA #40: Post-Creative Creativity
Panorama-orama! - Museum Games - Translation - SF Writer Podcasts - Uncreative LLMs
Sol LeWitt (wikipedia) was a conceptual artist, perhaps best known for his “wall drawings”—which are in fact not painted by him at all. They are a series of directions he wrote, prompts, if you will, that are then “performed” on walls in museums by various human artists. Or are they draftspeople? LeWitt’s take was that “each person draws a line differently and each person understands words differently.” Only their particular understanding makes the implementation interesting.
Last month, Amy Goodchild, a well-known algorithmic artist, prompted ChatGPT to produce p5.js code according to LeWitt’s (slightly modified) instructions. The results were generally… not that good? But her post is very funny and worth a read, with commentary like “Breaking out of the square format for reasons known only to itself, ChatGPT implores us to contemplate questions like, ‘why use p5js inbuilt width and height variables when you can create your own, much longer ones?’ and, ‘why should you draw two lines if you only feel like drawing one?’”
I’ve been reading about poetry translation and thinking about copyright. A haiku in Japanese— “Furuike ya/ kawazu tobikomu/ mizu no oto” (Bashō)—might be considered a set of instructions, which have been variously executed in English as:
The old green pond is silent; here the hop
Of a frog plumbs the evening stillness: plop!

Once upon a time there was a frog
Once upon a time there was a pond
Splash!
… both of which are considered “bad” and the latter of which this paper suggests expresses contempt for frogs. The authors’ translation is simply: “The ancient pond/And a frog jumps in/The sound of water.”
For a particularly poetic take on the creative challenge of translating poetry, this essay by Willis Barnstone is a humdinger.
Translation is an art between tongues, and the child born of the art lives forever between home and alien city. Once across the border, in new garb, the orphan remembers or conceals the old town, and appears new-born and different.
Copyright as a construct exists to protect income and requires a “modicum of creativity” (source). While clearly creative, evidently translation occupies a “gray area” of copyright, and there are still papers arguing things like “postpositivist theory conceives of originality and authorship as zero-sum concepts, hence positioning the translation and the original, the translator and the author in an irreconcilable relationship” (source). Essentially, how can there be dual authorship of an original and derived work?
Sol LeWitt gets the credit for the wall directions, not the artist implementers; but then, CEOs make millions and the workers who build their companies don’t. Perhaps the wall painters are considered “work for hire,” and in a copyright world, cede their stake? Or we just like stories about art that are simpler and have a single hero creator.
We like the story of the Beltracchi forgers, or I sure do: they created missing works “that skillfully imitated the styles of deceased European artists including Max Ernst, Fernand Léger, Kees van Dongen and André Derain.” They sold them for millions, lived modestly, and evidently it was only the reputations and vanity of the art assessors and auction houses that suffered. (Wolfgang is doing NFTs now 🙃.)
That work showed more than a modicum of creativity. Wolfgang translated style to new work. But it was fraud that got them sent away. As a side note, Cory Doctorow has a good piece on how copyright hasn’t actually protected the income of artists, as CEOs and distributors rake in profits off their work. It has, however, made it harder for artists to create. To sample, to remix.
Start of Feb, there was a thoughtful article on “postcreativity” and the AI art evolution by Pau Waelder. It’s a version of their essay in this downloadable edited collection, The Meaning of Creativity in the Age of AI (which I haven’t read all of yet). This piece tracks historical computer-aided artists and their takes on code as collaborator. Jan Løhmann Stephensen (paper source) suggests the term “postcreativity”:
Through the lens of postcreativity, we can consider artworks as the outcome of an interaction between a variety of actors, including humans, objects, systems, and environments. In AI-generated art, this means taking into account all the people, animals, natural environments, institutions, communities, software, networks, etc. that take part, more or less directly, more or less willingly, in the artwork’s making.
From the Spider Robinson story on copyright cited by several this month: “Artists have been deluding themselves for centuries with the notion that they create. In fact they do nothing of the sort. They discover.” In the latents, in the process of using tools to refine their ideas, in the creation of shitty first drafts that don’t achieve their goal yet and need rework. In working from direction.
I will wrap here by recommending the interview with Ken Liu on The Gradient podcast which covers translation as a creative act, the use of technology—understood writ broadly—as integral to our creation and conception processes, fiction and art as personal. More on this below in the Story Generation section.
TOC for content (these will be links on the web):
AI Art Tools (Panoramas and ControlNet, NeRFy, Video, Misc Tech and Art)
ProcGen and Data Vis
Games and Story Gen
NLP and Data Science
Things I Think Are Awesome is a reader-supported publication; I take off work to write this each month. If you value it, please consider paying!
AI Art Tools
3d Pano Scene Generation - last newsletter, I posted the prompt trick that sometimes works in Stable Diffusion to generate equirectangular panoramas that can be viewed in VR/3D sphere viewers. A month later, and this is a new industry. Right after publishing that, a new site launched from LatentLabs, and then last week, Blockade Labs launched Skybox with high quality images with a nearly invisible seam (but it’s there). You can download these images, which is 💋. LatentLabs has now released a LoRA SD model trained on CC0 panoramas to better generate the equirectangular shape (usage thread); and Julien Kay made a Unity tool to turn Skybox output into depth-based renders using MiDaS (example usage). Blockade is doling out API keys, which OMG!
🔥 In other panorama generation news, WOW detailed composition of large panorama images with Mixture of Diffusers by Álvaro Barbero Jiménez: “Each diffuser focuses on a particular region on the image, taking into account boundary effects to promote a smooth blending.” HF demo! Great git repo! A paper! This is an extreme example from his repo to show off the blending of many content and style blocks in one image:
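The blending trick is easy to see in one dimension. This is a toy sketch of the idea (my own illustration, not code from the Mixture of Diffusers repo): two overlapping “tiles” stand in for two diffusers’ predictions over their regions, and a linear cross-fade in the overlap keeps the seam smooth.

```python
# Toy sketch: blend two overlapping 1-D "tiles" with linear ramp weights so
# there is no hard seam. Each tile stands in for one diffuser's prediction.

def blend_tiles(tile_a, tile_b, overlap):
    """Concatenate two tiles, cross-fading in the shared overlap region."""
    out = list(tile_a[:-overlap])          # region only covered by tile A
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)        # weight for tile B ramps toward 1
        out.append((1 - w) * tile_a[len(tile_a) - overlap + i] + w * tile_b[i])
    out.extend(tile_b[overlap:])           # region only covered by tile B
    return out

# Two tiles that disagree in the overlap: blending interpolates between them.
a = [1.0] * 6   # tile A predicts 1.0 everywhere
b = [3.0] * 6   # tile B predicts 3.0 everywhere
blended = blend_tiles(a, b, overlap=4)
# smooth monotone transition from 1.0 to 3.0, no seam jump
```

In 2D the ramps become smooth per-pixel weight masks that sum to one wherever tiles overlap, which is what lets each region have its own prompt.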
Related, MultiDiffusion’s pano code worked for me out of the box, and there is a nice demo on HF now too (default image width is 4096!). The part that allows scene composition hasn’t been released yet. This one will kind of tile your request across a lot of the image space, rather than make one wide view instance of your thing, as seen in this segment:
ControlNet is, yes, the latest incredibly useful toolset in Diffusion generation land. We knew the depth2img model with Stable Diffusion 2 was useful for grounding new images and doing blender bumps, but now we get an integrated toolkit of preprocessors for doing OpenPose poses, depthmaps, normals, scribbles, etc. It’s huge. And can be batch applied to video frames to restyle them as in this example output from the HF demo thanks to fffiloni; and here’s a YouTube tutorial on doing it with Auto1111 by Olivio Sarikas. If you read down the repo page, many tips and integrations can be found. (For a comparison of ControlNet output and the similar recent T2I-Adapter, try a Medium post by Catmus.)
For some exciting examples of using ControlNet, I’m featuring Bilawal Sidhu: by day a Senior PM on Google Maps, where he worked on the 3D map features that were my favorite part of the Google webcast this month—apart from my own project, of course. He has twitter-posted and documented his Indian room re-design and then his NeRF-based street scene, both with tip walkthroughs. He has a YouTube channel and a Substack with tutorial info, and there’s an interview/coverage of him here.
LumaLabs’ NeRFs are getting fun. NeRFs are “neural radiance fields,” a neural network method of using photos (or videos) to construct novel views in 3D. I tried one from a video I shot in very low light conditions in Paris, so it’s blurry but it looks art-styled and I love it. You upload your video and they process it (eventually) into a 3D mesh with several view options and downloads.
Latent Blending: for smooth, barely perceptible transitions in generated videos, I also posted this last month, and now it’s even better (with upscaling too). Johannes Stelzer added a HF demo too. I love my results, but there is a very professional beautiful example by Vanessa Rosa and Xander Steenbrugge here 🤩.
More video-related: The first edition of Stable Digest from Stability is very good and has an interview with AI video artist Rémi Molettee (IG link).
RunwayML’s video tool Gen-1 led us to more Simpsons intro redos, like Paul Trillo’s excellent example of cubist stop-motion, plus an AI Film Festival.
Fast Stable Diffusion built for M1 Macs (open source, with UI), from the amazing Hugging Face. (I mean, just a thousand cheers for them, right?!)
Compare captioning models here in a HF demo. BLIP (from Salesforce, checkpoints) is supposed to be the winner these days but ymmv depending on the image you try.
Code coming soon, a speed improvement on Dreambooth that will make it essentially “real time” and open up applications: “Encoder-based Domain Tuning for Fast Personalization of Text-to-Image Models.”
Various ways to serve Keras-based Stable Diffusion.
“Mauritshuis hangs artwork created by AI in place of loaned-out Vermeer”. (Girl with a Pearl Earring, and yes, they knew what they were doing.)
One provocative point from the Waelder piece: “What should be asked is not if AI can create art, but whether the art created by AI is worthy.” An academic study finds it is not judged worthy at least as a sole author: Anthropocentric Bias in the Appreciation of AI Art. “The same artwork is preferred less when labeled as AI-made (vs. human-made) because it is perceived as less creative and subsequently induces less awe, an emotional response typically associated with the aesthetic appreciation of art.” But as a co-author, I think things might differ.
ProcGen and Data Vis
Kevin Schaul of WaPo on replacing MapBox with cheaper OS mapping tools. The giant serverless tile delivery (PMTiles) by index is amazing. Fab post.
PyGWalker - Turn your Pandas DataFrame into a Tableau-like visualization. (Via Data Science Weekly newsletter.)
MagicCanvas for pixel-streaming webgl, invisibly switch between local and remote. Solves scaling issues?
SwissGL, a wrapper on top of webgl js. Demos page. From Alex Mordvintsev.
Geopipe.ai is offering downloaded 3D maps of the real world. Like for games. Huh.
Fractal Cities, a post about doing 3D cities from mapzilla/Craig Taylor. Doing artistic dataviz with Houdini. I was previously a fan of his “Domed Cities” renders.
Motion Canvas Camera is a cool component for Motion Canvas that lets your camera follow paths and objects.
Games and Story Gen
MarioGPT Mario level generation (with playable HF gen demo). This is super cool because of the translation of level layout to language model training strings (I’m trying to do something similar right now!). Also see this Sokoban level generation paper by Todd et al.
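The layout-to-string translation is simple to sketch. This toy encoder (my own illustration, not MarioGPT’s exact tokenization scheme) flattens a tile grid column by column, so vertically adjacent tiles stay near each other in the token stream, and it round-trips losslessly:

```python
# Toy sketch of serializing a tile-based level into a flat string an LLM can
# train on. Column-major order keeps local vertical structure adjacent.

def level_to_string(rows):
    """Flatten a list of equal-length row strings into one column-major string."""
    height, width = len(rows), len(rows[0])
    cols = ["".join(rows[y][x] for y in range(height)) for x in range(width)]
    return "\n".join(cols)

def string_to_level(s):
    """Invert level_to_string: columns back into row strings."""
    cols = s.split("\n")
    return ["".join(col[y] for col in cols) for y in range(len(cols[0]))]

level = [
    "----",   # '-' = sky
    "--E-",   # 'E' = enemy
    "XXXX",   # 'X' = ground
]
encoded = level_to_string(level)
assert string_to_level(encoded) == level  # round-trips losslessly
```

Once levels are strings like this, “generate a level” becomes ordinary language-model sampling, and prompts can condition on level features.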
Next-Gen Web Games. What can be done with the latest tools? Here’s a sample.
Write a First Person Game in 2KB with Rust. Pretty simple graphics, lots about ray shading.
Using MS Flight Simulator’s unpublished API to control a plane with python and JS by Pomax— this is amazing (and detailed). Sadly there is no access to the graphics pipe thru this method, so no screenshots or movies.
Archaeology mode is finally showing up in Minecraft 1.20.
Miniworld is a “minimalistic 3D interior environment simulator for reinforcement learning & robotics research.”
Mould Rush game by Raphael Kim (h/t Philip Vollet): “a biotic game that involves slow interaction between humans and living mould, mediated through the internet. As the microbes grow and propagate, their growth patterns are live-streamed online, where the gameplay takes place. Every game, or ‘campaign’, is played in a completely unique landscape, formed by a seemingly-random nature.” Er, Last of Us vibe?
A treasury of Zork maps from Andrew Plotkin.
Using ChatGPT/LLMs in games, especially for NPC (non-player character) interactions: “AI Dialogue is Here, Can We Use It in Cool Ways?” and the Mount and Blade mod that uses ChatGPT. Met primarily by game writers/designers saying things about needing to be able to control the world state and story better; while I agree, there is also a giant effort in the tooling around prompting LLMs to enable structured memory, parsing, knowledge bases (e.g., LangChain, the best known). Hire an NLP person! 😊
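As a sketch of what that tooling boils down to (all names and fields here are hypothetical, not any particular library’s API): keep the world state and NPC memory in engine-controlled data structures and render them into the prompt, so the canon isn’t left to the model to remember:

```python
# Hedged sketch of "structured memory" for LLM NPCs: the game engine owns the
# world state and event log as plain data, and renders them into each prompt.

def build_npc_prompt(npc, world_state, memory, player_line):
    """Assemble a grounded prompt for an NPC reply (illustrative only)."""
    facts = "\n".join(f"- {k}: {v}" for k, v in world_state.items())
    recent = "\n".join(f"- {m}" for m in memory[-5:])  # only recent events
    return (
        f"You are {npc['name']}, {npc['role']}. Stay in character.\n"
        f"World facts (do not contradict):\n{facts}\n"
        f"What you remember:\n{recent}\n"
        f"Player says: \"{player_line}\"\n"
        f"{npc['name']} replies:"
    )

prompt = build_npc_prompt(
    npc={"name": "Mira", "role": "a blacksmith in Oakvale"},
    world_state={"war_status": "truce signed yesterday", "player_reputation": "hero"},
    memory=["Player returned my stolen hammer.", "Bandits seen on the north road."],
    player_line="Any news?",
)
```

Frameworks like LangChain add retrieval, parsing of the model’s reply back into game state, and longer-term memory on top of this basic loop.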
Story Plot Prediction (paper by Huang et al, code coming). “We create a system that produces a short description that narrates a predicted plot using existing story generation approaches. Our goal is to assist writers in crafting a consistent and compelling story arc.” They use the Booksum dataset I posted about before, which links source text to summaries at different levels (I knew it would be a huge asset for work like this). Also see previous work with code on semantic frames from plot chunks.
A new tool I haven’t tried yet (on the waitlist), a node-based story generation app called Jotte. Looks kind of Twine-like in its UI.
As expected, a boom in AI-written books on Amazon self-publishing (including children’s books, of course) and now there’s “AI Generation is Flooding Literary Magazines But No One is Fooled” (Verge). Includes the sad story of sf&f mag Clarkesworld, now closed to new submissions. I imagine the editors feel the way I felt as a teacher hand-grading Python projects when the students were cheating and didn’t even rename their variables.
John Scalzi also says there’s no real competition for his job because LLMs aren’t good at creative writing yet. I mentioned Ken Liu’s interview on The Gradient podcast: Ken also talks about feeling dissatisfied with the AI model assistance during Google’s Wordcraft writers’ evaluation experiment, a feeling most of the participants shared.
Stories By AI substack experimented and rejected much of ChatGPT’s help:
“Perhaps unsurprisingly, ChatGPT tends to output predictable and bland writing, and it’s not easy to steer it in the direction of something that better resembles your “voice”. My take-away is that it can be good for edits that don’t require it to do anything interesting, but otherwise tools that are specifically aimed to help authors with AI (such as SudoWrite and Laika) are far more useful.”
Ted Chiang’s much-shared post on AI as a “blurry jpeg” of the internet gets good to me at the end, on the process of creative writing:
Sometimes it’s only in the process of writing that you discover your original ideas. … Your first draft isn’t an unoriginal idea expressed clearly; it’s an original idea expressed poorly, and it is accompanied by your amorphous dissatisfaction, your awareness of the distance between what it says and what you want it to say. That’s what directs you during rewriting, and that’s one of the things lacking when you start with text generated by an A.I.
NLP and Data Science
MiniChain from Sasha Rush, another LLM (large language model) prompt framework, like LangChain. And for browsers: Promptable is a TypeScript lib for interacting with LLMs featuring embeddings, model providers, chaining, etc. But this is a really good pushback on these tools: “There are so many "Prompt-Ops" tools and I'm sold on none of them,” in this thread.
Open source ChatGPT efforts: Colossal-AI to train your own, and LAION’s OpenAssistant project. Yay, go!
Sumgram, for finding (and constructing!) the most common n-grams from a corpus. I.e., you don’t have to go thru all 1, 2, 3, … grams yourself. Via hrbrmstr’s techy gems newsletter.
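For comparison, the plain fixed-n counting that Sumgram improves on (its trick is merging overlapping n-grams of different lengths into “sumgrams”) is a few lines of stdlib Python:

```python
# Plain fixed-n n-gram counting with the stdlib, for contrast with Sumgram,
# which additionally merges overlapping n-grams of different lengths.
from collections import Counter

def top_ngrams(texts, n=2, k=5):
    """Return the k most common n-grams (as word tuples) across a corpus."""
    counts = Counter()
    for text in texts:
        words = text.lower().split()
        counts.update(zip(*(words[i:] for i in range(n))))  # sliding windows
    return counts.most_common(k)

corpus = [
    "the old pond a frog jumps in",
    "a frog jumps in the sound of water",
]
top = top_ngrams(corpus, n=3, k=2)
# the two trigrams shared by both lines come out on top, each with count 2
```

The pain Sumgram removes is having to rerun this for every n and reconcile the overlapping results by hand.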
PEFT - Parameter-Efficient Fine-Tuning using LoRA, producing results comparable to full fine-tuning of LLMs but with consumer-accessible GPUs and CPU offloading. Can be used on Stable Diffusion too. Another Hugging Face joint!
Use Gradio demos on HuggingFace as APIs. This is very cool and time-saving!
FlexGen: Running large language models on a single GPU with a ton of cleverness. Work in progress, “FlexGen is mostly optimized for throughput-oriented batch processing settings (e.g., classifying or extracting information from many documents in batches), on single GPUs.”
Meanwhile, Phil Schmid continues his excellent series on deployment, Flan-T5 XXL on Sagemaker.
Jax 101 tutorial from DeepMind folks.
Internet Explorer by Li et al—terrible name—this training process uses web searches and self-supervision to expand a dataset. I think we’re going to see more and more of these live connected bot models. 🤖 Also, check this case of Replit’s code model asking for an image to help debug something. Eeeee!
★ Children of Memory, by Adrian Tchaikovsky (sf). Really good if different 3rd installment in the Children of Time series, which I recommend all of. Now we have talking crows (corvids), who are essentially stochastic parrots; and a lot of discussion on what sentience means and “copy” behavior that is very timely. Plus, temporal and folklore elements. Strong rec.
👉🏿 Related: I highly recommend this Ezra Klein podcast interview with Adrian about alternate intelligences in his books, AI (and ChatGPT), and creativity. They touch on games, generation of cliché content for tv and books, and practicing art. Ezra does NOT think AI models can be “creative.” For more on these topics, see the Games and Story section above; and you should also listen to Ken Liu’s interview. This is the much-cited Spider Robinson story (cited by both Cory Doctorow and Ezra Klein).
Emily Wilde’s Encyclopaedia of Faeries, by Heather Fawcett (fantasy). A curmudgeonly professor goes to a remote northern island (very Icelandic) to study stories of their hidden folk. She is followed by an annoying handsome colleague who may not be human (you know the romance is coming, but it’s not the main point). The folkloric elements and faeries themselves are really well done, and the writing is quite nice. Totally enjoyed.
The Bird King, by G. Willow Wilson (fantasy). Unusual and fun: a slave concubine and a gay mapmaker who can draw fantasy lands into reality flee the Spanish army and Inquisition. There are fantastical daemons and magic and legends throughout, plus some blood and torture (CW).
51, by Patrick O’Leary (sf). A very strange non-linear book (stop it, everyone!) with accounts of aliens and a door between worlds and invisible “imaginary friends.”
Cold Water, Dave Hutchinson (sf/spy). This is the first one I’ve read of his altered Europe spy series, and I will read more. But this had a lot of non-linear tricks and I’m here to say, “Stop it people, making it confusing isn’t better anymore.” I enjoyed the weirdly tradecraft-full story in a Europe with small pocket universes, though it took a long time for the sf elements to show up.
Blackhaven is a free 3d game in which a black museum intern does various tasks around the museum of a historic (fictional) Virginia plantation. She tests the quiz app and listens to audio, scans and reads documents, and finally discovers some missing pages that become quite personal. I played it straight thru for 2 hours—I do love museums—and although the start is a bit slow (though livened up by her sarcastic commentary), the mystery at the end is intriguing. The project is the brainchild of James Coltrain, a professor at UConn (article), who is making a sequel now via his studio Historiated Games.
Ib (the 2022 remake of the 2012 game) is an evidently influential 2D puzzle story game about a little girl lost in a creepy museum with a surreal, haunted exhibition and many traps. It’s horror. (Thanks to Andrew Plotkin/Zarf for the tip.)
VR: Unbinary is a cute game with a Portal-esque theme and vibe: You’re being directed by an AI voice to solve various room escape puzzles on a space station. Some of the other robots are helpful. It’s entirely hand-painted and the look and feel are lovely. Available for Quest 2 as well.
★ Last of Us (HBO). What can I say… the acting, direction, guest stars, and story have been great. This post by Jesse Mostipak is right on about the horde scene with Joel shooting fungi from above.
Mandalorian s2 (Disney+). I wanted more of Pedro Pascal escorting magical kids, so I am catching up. It’s not as deep as Andor, but I am still totally engaged by the universe and critters.
Poker Face (Peacock). Natasha Lyonne as the wise-cracking woman on the run who can tell when someone is lying and solves murders after you’ve seen them happen. Unusual setup, does it well.
Happy Valley s3 (BBC). I was glad to see this depressing story about drug-and-mob ridden Yorkshire come back. 😄 Ryan wants to have a relationship with his “Dad,” the psycho Tommy Lee Royce, and grannie mustn’t know about it. It’s very tense with the usual excellent acting. A good finale and close, and how often can we say that? (CW: suicide history, rape history, very bloody.)
Old shoes, old roads—
the questions keep being new ones.
Like two negative numbers multiplied by rain
into oranges and olives.

- Jane Hirshfield (from here)
Another long one…. I’m considering doing a mid-month release of a shorter list, for paid subscribers only, to reduce the length at the end of the month. If you stuck with me this far, make sure you see this guy make a house for a happy frog.
Best, Lynn (@arnicas on masto and twitter while it lasts, so, maybe next week?)