TITAA #29: Initiation Wells and Image Stories
DALLE 2 - Story Telling With AI Art - Postviral Fiction - Very Good Reads
I spent half a day in Sintra, Lisbon, on a recent work trip. The tourism office had a fab exhibit on myths and legends of the region, full of multimedia displays and recordings of folklore. One account was a brief aside about a mysterious well in nearby Quinta da Regaleira, a well of 9 levels connected by a descending spiral staircase, used for weird initiation rites of some kind. I looked it up after coming home. According to an article (actually) titled, “These Mysterious Initiation Wells in Portugal Are Kind of Creepy”:
Built in 1904 by a man named António Augusto Carvalho Monteiro, the place resonates with its owner’s fascination with the occult, masonry and alchemy. Symbols of Rosicrucianism, the Knights Templar and Freemasonry are hidden throughout its spectacular architecture.
The well’s levels and stairs are numbered such that they’re thought to be related to Tarot mysticism, but it’s hard to find any real details, at least in English. Even weirder, the floor at the bottom of the well features a compass rose overlaid on a Knights Templar cross.
But wait, there’s more! Wikipedia reports, “The park also contains an extensive and enigmatic system of tunnels, which have multiple entry points that include: grottoes, the chapel, Waterfall Lake, and ‘Leda's Cave,’ which lies beneath the Regaleira Tower.” The underground tunnels connect the initiation well to another well, called “The Unfinished Well.” I guess they didn’t get the floor done on that one. I didn’t have time to get there, but it sounds like it’s definitely worth a visit and further reporting next time I’m in the area.
When I tweeted the name of the article above, Ronny Khalil illustrated it for me with Disco Diffusion and a bit of Gustave Doré style. I love it.
AI Art News / Code
DALL-E 2
If you are paying any attention to AI art tools, you will know that DALL-E 2 has made a splash even with a very limited set of test users. (See the MIT Tech Review article, and this useful explainer blog post from a developer who worked on it, Aditya Ramesh. In particular, his explanation of unCLIP is good.) DALL-E 2 is very good at compositional image semantics (following a recipe to construct a picture) but may be less good at being artistic “by default” than other models, such as Midjourney’s. (If you are just joining me, I wrote at length about Midjourney last month.)
David Orr wrote a good article on it here, trying out a bunch of stuff to put it through its paces, and various Twitter people are doing the same (e.g. @hardmaru, @Merzmensch). It’s not great at people in all cases. It has a tendency to look meme-y and cartoony sometimes? Probably because it was trained on purchased image assets; they are being very cagey about the data set contents. Is OpenAI’s business goal, as an MS-funded operation, to create clip art on demand for PowerPoint? Or are they trying to figure out what the business model is, like so many in this space?
In any case, here is DALL-E 2 on a linguistic challenge that is grammatical but semantically anomalous, the LING 101 classic “colorless green ideas sleep furiously”:
Midjourney, in contrast (before upscaling/detail), seems much more artistically inclined by default:
However, Merzmensch has a glowing review of the output, even from vague prompts, in this article. I have tried some of the same prompts in Midjourney, often to quite different effect. For example, for his prompt “the truth about the beginning of the world” (DALL-E output shown in the upper left), Midjourney produces very different symbolism (the other 3):
If you want to see the popular generations of “a baroque painting of a man who is angry because Starbucks made his drink incorrectly,” head ye to Twitter here (MJ inspired by a DALL-E prompt output from Lapine DeLaTerre).
Here’s an article comparing Disco Diffusion 5 generation and DALL-E 2, by @nin_artificial. As noted on Twitter, the results for DD are very good indeed; and I think they are more artistic than comparable DALL-E 2 output.
Meanwhile, DALL-E Mega, an open source attempt at DALL-E, is in training (by @borisdayma). You can follow the current prediction output on Weights & Biases. The model checkpoints are live in the DALL-E Mini Hugging Face Spaces demo.
This piece by Nathan Baschez on visual design “vibes” argues that with high quality image generation on demand made easy, corporate site design will need to do new things to signal brand quality and taste.
Other Text2Image Links
The Disco Diffusion colab keeps growing/warping: now with VR mode. (I have not tried that.) There is also a combined version of Latent Diffusion + Disco Diffusion in one place, Centipede Diffusion, by Zalring.
The Streamlit-UI based colab meta-tool from @multimodalart called MindsEye now has several other recent models in it, including Latent Diffusion via the GLID-3-XL model from Jack0. There is also a new PyTTI Tools update and notebook.
You can try the CompVis Latent Diffusion model trained on the LAION-400M dataset, via a colab and Hugging Face app by @multimodalart. (Note: The HF version has an NSFW CLIP filter in place, because very simple queries turned up horrible porn results.)
ru-DALLE-Aspect Ratio: This model by Alex Shonenkov is very pretty and produces a kind of watercolor effect, especially the “surrealism” model. There’s a colab and also now a Hugging Face UI made by @multimodalart.
I’m super impressed by the captioning in this Zero Shot Image to Text model demo. I gave it a pic I had generated in Midjourney (i.e., not a photograph) and got a very reasonable caption result.
If you want to stay on top of the text2image research and tools world, @multimodalart’s weekly updates have been invaluable to me.
Story telling with generated images!
I’m really interested in how text2image generated imagery is inspiring storytelling in various forms. I’ve noticed this with Townscaper too (and have pointed to instances of it in previous newsletters). I’ve probably seen more with Midjourney because I spend more time there, but feel free to send me more links with other examples?
Evidently in the early beta Midjourney server, there was a fun game of “telephone” in which people riffed off each other’s image prompts or images; @localstarlight told me (permission to quote),
“When we did it… it was more about taking someone’s previous prompt and modifying or expanding on it. Or gradually telling a story. At one point there was a crazy story about some raccoons who took over Brazil and eventually ended up taking over the entire planet and then spreading out into the solar system to colonise it.”
I’ve been playing some telephone games and it’s a funny collaborative kick. I love how you end up in places, especially image generation corners, where you would never have gone on your own. “Authorship” gets even more muddy here, but maybe in a good way? People are also trying to organize Exquisite Corpse games, of course.
@rob_sheridan is writing a sci-fi horror story using Midjourney generated imagery, narrated on Twitter (@volstofresearch) and Instagram. (Rob is also doing interactive narrative on Twitter via audience votes here.)
I liked this Hugo poem (in French) illustrated and animated with generated images, by Thibaud. (Made with Disco Diffusion’s animation.)
@fabianstelzer used Midjourney and GPT-3 to try to generate a kids’ book (tweet thread). He also did a fun thread of annotated “Cthulhu through the ages” imagery. There’s a short DALL-E 2 illustrated book written by GPT-3 here, posted by Phillip Isola. Here’s an article on Using DALL-E 2 to Illustrate Stories by Little Martian/@va2rosa. They say:
As an artist who has worked intensively as a professional illustrator, I do have some mixed feelings about these new models. I find them fascinating, mind blowing in terms of expanding creativity possibilities. Especially if combined with the language models such as GPT-3, I see a lot of potential for a new world of independent artists, who at some point will be able to create interactive animated stories that guide the audience through a fictional universe. The AI models could also create custom content based on each user's taste, without a specific artist to guide it.
@bzor made an actual interactive branching narrative about an alien museum, using Midjourney illustrations, IN ACTUAL TWITTER THREADS, here.
Many moons ago (literally), I made an interactive fiction game for Google Arts & Culture, and it is finally released. It is illustrated using last year’s forms of text2image, like VQGAN and early guided diffusion, and I wrote it in Ink. The game takes you through Barcelona’s Gaudí sites and Venice churches and canals looking for a penguin who disappeared from an animal tour group. Thanks to friends at GAC (Bastien Girschig and Caroline Buttet) for their help with the web implementation and design, a non-trivial and challenging amount of work.
Also finally released was the paper on VQGAN image generation with CLIP (VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance, by Katherine Crowson, Stella Biderman, Daniel Kornis, Dashiell Stander, Eric Hallahan, Louis Castricato, Edward Raff). Congrats Katherine et al., your open source work on image generation has changed the internet and image creation.
Other Creative Gen Visual Stuff
Samila, generative art generator in Python.
Pixel Art Academy: a pixel art lesson in the form of an adventure game.
A beautiful WebGL 3D visit to Persepolis on the Getty site, h/t Nicolas Barradeau.
Claus Wilke has a shader tutorial course up, which will help you make unbelievable generated landscapes like the ones he posts.
Wave Function Collapse in Processing (by solub) (h/t Dan Shiffman).
Games / Narrative / E-Lit
Micro Fiction Games should be back in 2022: “any game (ttrpg, larp, tabletop...) that will fit entirely in a single tweet (280 characters). Games must be actionable! once someone has read the game text they should be able to infer how the game is played and be able to go ahead and play it.” Two previous years are in the archive site run by @james_chip_rpg. (h/t @triptych)
How to get a job in video games (pdf), if you must.
Inform 7 was open-sourced. Statement from Graham Nelson:
The overhaul of Inform had three goals: to improve the clarity of code throughout, increasing consistency, reliability and test coverage; to document its workings in depth, enabling publication of the compiler and its satellite software as “literate programs”; and to build a more ambitious infrastructure for Inform to relate to software other than the project at hand, with a new build manager (“inbuild”) at the top level, and a new intermediate representation called “Inter” together with a new pipeline architecture for linking, optimisation and code-generation.
What Does Your Narrative System Need to Do? Thoughts and musings from the very experienced game designer Emily Short.
The full proceedings of the International Conference on Computational Creativity (ICCC) 2021 are here (I can’t recall if I shared before).
Automatic Story Generation: Challenges and Attempts survey paper by Alabdulkarim et al. (thanks, Raj).
Meet Your Favorite Character: Open-domain Chatbot Mimicking Fictional Characters with only a Few Utterances, paper by Han et al.
NovelAI.net is a competitor to AI Dungeon, it looks like? I overheard some good reviews of it, but haven’t tried it myself yet.
Elm Story, in early access. Looks like another visual node drawing tool for designing interactive narrative, but with a JSON export, a quality-based narrative focus, and a game engine? Worth exploring (I have not).
Interesting first edition of an online journal for electronic art/poetry, the html.review. This is the kind of experimental work where you regularly wonder wtf is going on and what you are supposed to do (if anything), and that’s ok. I love “today we saw” by Anna Garbier and Lan Zhang, which is a tour of image alt tags seen in a day’s crawl, with a washed out image overlay of the image they were attached to.
Martin O’Leary’s new Pangur tool for visual node programming with text transforms.
There are some large language models (GPT-J) fine-tuned on science fiction and fantasy books: KoboldAI’s Janeway and Picard models. Thanks to Ryan Micallef for finding these.
As I said above, you can play my new interactive fiction game from Google Arts & Culture, illustrated with AI-generated art, here.
NLP and Data Science
Seriously, a full length ArXiv article on Null Island, the data artifact location at 0, 0 lat/lon. (I love this.) It is called: “‘I think I discovered a military base in the middle of the ocean’ — Null Island, the most real of fictional places” (Levente Juhasz, Peter Mooney).
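If you have never met Null Island in your own data: it tends to show up when missing or failed geocodes get stored as zeros, so a pile of unrelated records all land at 0, 0. Here is a minimal Python sketch of flagging those rows before you map them. The `Point` record, the tolerance, and the sample coordinates are all made up by way of illustration; none of it comes from the paper.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Point(NamedTuple):
    """Hypothetical record type: a label plus latitude/longitude in degrees."""
    name: str
    lat: float
    lon: float


def is_null_island(p: Point, tol: float = 1e-6) -> bool:
    """True when the point sits within `tol` degrees of 0 lat, 0 lon.

    The tolerance is an arbitrary assumption; tune it to your data.
    """
    return abs(p.lat) <= tol and abs(p.lon) <= tol


def split_null_island(points: Iterable[Point]) -> Tuple[List[Point], List[Point]]:
    """Separate plausible points from ones sitting at (0, 0)."""
    kept: List[Point] = []
    suspect: List[Point] = []
    for p in points:
        (suspect if is_null_island(p) else kept).append(p)
    return kept, suspect


# Invented sample rows, purely for illustration.
sample = [
    Point("record A", 38.8, -9.4),
    Point("failed geocode", 0.0, 0.0),
    Point("record B", 46.1, 4.6),
]

kept, suspect = split_null_island(sample)
print("kept:", [p.name for p in kept])
print("suspect:", [p.name for p in suspect])
```

Flagging rather than deleting seems the safer default, since the suspect rows usually point at an upstream bug worth finding.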
A Review on Language Models as Knowledge Bases, position paper and survey by AlKhamissi et al.
Tutorial materials on doing NER (named entity recognition) pipelines in spaCy by Ben Batorsky.
spaCy ClausIE (“ClauCy”): information extraction in spaCy. You gotta love a readme that says you can extract facts from a sentence like: “A cat, hearing that the birds in a certain aviary were ailing dressed himself up as a physician, and, taking his cane and a bag of instruments becoming his profession, went to call on them.”
Hugging Face is offering a deep reinforcement learning class with a repo of materials here. I imagine this is partly due to Thomas Simonini joining them!
Book Recs
⭐ Ammonite by Nicola Griffith (SF). Really enthralling. The men who landed on planet "Jeep" are dead, due to a virus, and only some women from the crew have survived it. Due to infection fears, the “Company” has left them there in a military camp alongside the indigenous locals (all women). An anthropologist comes down to test an antiviral vaccine and learn how the women there reproduce without men. Her travels and relationships are epic and gripping. (CW: two reported scenes of violence that are hard to take.)
⭐ Sea of Tranquility by Emily St. John Mandel. (SF) A time travel investigator tries to find out how multiple people across time could have heard a violin playing from an airship terminal, in a time distortion event. ESJM also wrote Station Eleven and The Glass Hotel, and this book brings back some characters and plot elements from those, but it’s not really a sequel. There is, however, a virus that has killed many people on Earth. I loved this book as a small gem.
Before the Poison by Peter Robinson. (Mystery) A guy buys an old house in England and investigates the life of the woman who lived there who was executed for murdering her husband. I found his personal life dramas a bit detailed and uninteresting compared to the epic historical story, but I enjoyed it.
Mickey7 by Edward Ashton. (SF) Mickey is a repeatedly cloned “expendable” on a space mission to a frozen, inhospitable planet populated with giant killer bugs. He gets sent on all the risky jobs and gets regrown when he dies. Complications ensue when he ends up duplicated, because he returns from a mission after being reported lost. Because Mickey is a former historian, we get flashback stories on all his “deaths” and how he got here. It’s a good page turner with more reflective meat than I expected.
Marina by Carlos Ruiz Zafón. (YA?) A school boy meets a beautiful girl living with her ailing father in an old Barcelona mansion. They end up in a Gothic mystery involving animated puppets, awful revenge, magical elixirs, bodies in fires, women in black veils visiting cemeteries, underground tunnels, etc. Way over the top but atmospheric fun. They could have made good use of an initiation well.
The Blacktongue Thief by Christopher Buehlman. (Fantasy) A good romp with thieves, assassins, krakens, goblins, giants, big war birds, and truly excellent witches. Plus a wonderful cat (who lives). Snarky page-turner.
TV Recs
Severance (Apple TV): I did love it, it’s good SF and a good attack on CEO worship, infantilizing corporate cultures, and grief. I just wish it had been a miniseries with a solid ending already.
Slow Horses (Apple TV): If you like British spy shows, this is solid fare if full of cliches. Also for spy fans, I recommend The Courier (movie) on Amazon Prime, based on a true story.
Our Flag Means Death (HBO): Gay pirates, what more can I say!
Station Eleven (HBO): I barely remember the book because apparently I have zero plot recall anymore, but I gather this ends more happily. The virus parts are still hard to watch, since we get all the flashbacks of how everyone got where they are in Year 20. It’s a small world, too. But rec, if you can tolerate post-viral-apocfic.
Game Recs
I’m really liking NORCO, a pixel-art text-adventure set in a rotten southern Louisiana town. I’m not a huge fan of pixel art normally, but the writing is superb. I’ve been traveling IRL and Midjourneying and haven’t had much time for games, sadly.
You can play my short cute game, though!
Poem
Spring is like a perhaps hand
(which comes carefully
out of Nowhere)arranging
a window,into which people look(while
people stare
arranging and changing placing
carefully there a strange
thing and a known thing here)and
changing everything carefully
spring is like a perhaps
Hand in a window
(carefully to
and fro moving New and
Old things,while
people stare carefully
moving a perhaps
fraction of flower here placing
an inch of air there)and
without breaking anything.
— e.e.cummings
Thanks for reading. It’s, uh, been another month. At least the weather is better now and I’m sitting in a cottage in Beaujolais, finishing this up. If you have to work on the weekend, do it somewhere with a view.
All the best, Lynn / @arnicas