TITAA #47: Authenticity and Control

Digital Errors - DALLE3 - Control, the Game - Trees - Ghost Stories - "Bugs"

Sep 30, 2023

*Timelapse photo of the Sycamore Gap tree on Hadrian’s Wall (source). RIP lovely tree.*

This wandering collage intro is about the opportunities and disasters we get with digital image tools that are meant to augment our daily lives. That lovely photo above was possible because of a technical trick, time lapse, capturing what the eye can’t see in real time. I’m too sad to talk about the destruction of the tree, but because it was so well photographed, I hope we can visit with it in other forms in the future. Not the same, but still.

When CDs came out, I knew audiophiles who could tell the difference between analogue and digital and disliked the crispness of CDs. People still collect vinyl records for reasons. There is a wealth of other digital examples, but for some the digital unreality is getting even more obvious and troubling in our daily tools. Lots of folks are annoyed by simple image processing, exemplified in this tweet from Jon McKellan, who is quick to point out this algorithm presumably predates the current AI super-res and generation models. “What is a HDLMMODD anyway?”

twitter screencap of bad zoom on hollywood sign — *Tweet from Jon McKellan (h/t Alex Champandard)*

What if one of the smarter AI algorithms filled it in with the expected text, the way super resolution handles it? He wouldn’t have noticed or known, I expect. But there are more noticeable things being sold as “real” that very much aren’t. Yesterday’s “RealFill” project from Google and Cornell also has a few challenging claims. “RealFill is able to complete the image with what should have been there.” This example went from a cropped picture (I overlaid it in the white border) to an infilled “authentic” picture, using real images as context. Love that red stamp! It’s authentic because it could have been taken, but wasn’t?

Pic from the paper showing generated "real" image — *Example edited from their demo video at RealFill page*

And then there’s realer than real: a tweet yesterday shared this work, NeuRBF, which features a video of a zoom-in on Girl With a Pearl Earring, showing the step into hallucinated detail seamlessly, invisibly. Obviously there are no pores in the original painting; instead there are cracks and brush strokes. At some zoom level, we move from the painting into a new work of imagination.

my screencaps of painting and neuRBF contrast

After grad school, when I had RSI from typing and used a speech recognizer, I used to get errors I called “speechos.” Recognizing “sat” as “fat”, etc. Tools for that are much better now, but like autocorrect when typing, far from perfect, especially with the “wrong” accent. Here’s another audio helper: the “voice filter” on TikTok that tries to make out noise as voice works on random sounds, including cat meows (h/t Chris Albon & Doron Adler). Pets and household objects can murmur at you—with very weird voices—as if human: the aural equivalent of “HDLMMODD.” Is saying “meow” and “woof” more “authentic?” You decide: Hashtag #voicefilter link on tiktok.

Back in the realm of visual tools, here’s an Engadget piece on Apple’s new AI lighting to improve portraits. Better than real?

AI lighting in Engadget — *Source: Engadget*

“Taking advantage of the camera's multi-focus feature, it can separate you from the background and blur it out, as before, but now uses AI to examine the contours of your face and ‘light’ you in a variety of flattering or dramatic ways.” They claim it’s not a “filter”: “These aren't filters, this is real time analysis.” The distinction being the image is adjusted as it’s taken, rather than after? Authentic!

👉 For a good article on the use of vision algorithms in surveillance and control, read down to my review of Control, the game.

On to the other tech and book news… There’s a lot below, including “controllism” and the paranormal game Control, a bunch of new splatting tools, my test of DALLE-3, a good article on an artist working with a custom image model, a ton of great fiction reads on AI and horror, tutorial NLP notebooks, and an excellent poem. Please subscribe if you haven’t yet, it allows me to keep doing this: it takes a ton of time!

TOC (will be in-page links on the web page):

AI Art Tool News (Controllism, 3D Gaussians, DALLE3, Articles & Misc)
Games-Related, Emergent Bugs in Agents…
NLP & DataVis
Book Recs
Game Recs, mostly Control
A Poem

AI Art News

“Controllism” (I think a term from Fabian Stelzer?) was the fad of the past few weeks with Controlnet-QRcode based spirals, text, and optical illusions all over social media. (I linked to the original source in my last newsletter.) You have to look askance or blur your eyes to make out some of them, especially the crypto text ones. Tools appeared on the new glif.app (and overwhelmed it!), the Huggingface demo site for Illusion Diffusion was pounded, Krea.ai posted a bunch, there’s one on ArtBreeder now, and even Pika Labs video gen is supporting “encrypted” images (link to ex-twitter). It’s work to get a good result, though, on all of them: pick your input structure image carefully (white on black, black on white, some work ok with color…) and weight it differently depending on the prompt.

*Morocco, on glif.app, by Marine Castanié*

I duplicated the HF illusion space to experiment with different weights and prompts, and these were some of my favorite owls:

*Images created by me with source owl on Pixabay and HF space IllusionDiffusion*
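For anyone who wants to be systematic about that kind of experimenting, here is a minimal Python sketch of the bookkeeping. It inverts a toy grayscale structure image to get the opposite polarity, then lists every prompt, polarity, and weight combination to try. It does not call any real image model, and the function names and the `weight` key are hypothetical, made up for illustration.

```python
from itertools import product

def invert(gray):
    """Flip polarity of a grayscale grid (0-255): white-on-black <-> black-on-white."""
    return [[255 - v for v in row] for row in gray]

def sweep_settings(prompts, weights):
    """List every (prompt, polarity, weight) combination worth trying.

    'weight' is a hypothetical name for how strongly the structure image
    should steer the generation; real tools label this differently.
    """
    polarities = ("white_on_black", "black_on_white")
    return [
        {"prompt": p, "polarity": pol, "weight": w}
        for p, pol, w in product(prompts, polarities, weights)
    ]

if __name__ == "__main__":
    owl = [[0, 255], [128, 10]]  # tiny stand-in for a structure image
    print(invert(owl))
    for setting in sweep_settings(["a snowy forest"], [0.8, 1.2]):
        print(setting)
```

Keeping the runs in a list like this makes it easier to remember which weight and polarity produced a favorite result.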

I almost prefer the upside-down generation trick Alex Carlier came up with although it’s harder to perceive. It’s clever and effective. You give two prompts, in the same style, and it creates an image that can be reversed to reveal the second image. This is me doing “a watercolor mountain scene” (left) and the reverse of it shows the other prompt, “a watercolor tropical island.” Yes, it’s one image. HF demo here made from his colab.

*Reversible image, two prompts: mountain scene, tropical island scene.*
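One way to picture how a single image can satisfy two prompts is to denoise it from both orientations at once and average the results. The toy Python sketch below does that on tiny grids of numbers. It is an assumption about the general idea, not a description of the code in the colab: `pull_toward` is a hypothetical stand-in for a prompt-conditioned model, and "upside-down" is assumed to mean a 180-degree rotation.

```python
from typing import Callable, List

Image = List[List[float]]  # toy grayscale image as rows of floats

def rotate_180(img: Image) -> Image:
    """Turn the image upside down (180-degree rotation)."""
    return [row[::-1] for row in img[::-1]]

def pull_toward(target: Image, rate: float) -> Callable[[Image], Image]:
    """Hypothetical stand-in for a denoiser: nudges an image toward a target."""
    def step(img: Image) -> Image:
        return [
            [p + rate * (t - p) for p, t in zip(img_row, tgt_row)]
            for img_row, tgt_row in zip(img, target)
        ]
    return step

def two_view_step(img: Image,
                  denoise_upright: Callable[[Image], Image],
                  denoise_flipped: Callable[[Image], Image]) -> Image:
    """Average the upright estimate with the flipped-view estimate, flipped back."""
    est_a = denoise_upright(img)
    est_b = rotate_180(denoise_flipped(rotate_180(img)))
    return [
        [0.5 * (a + b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(est_a, est_b)
    ]

if __name__ == "__main__":
    mountain = [[1.0, 0.0], [0.0, 0.0]]  # what the upright view should show
    island = [[0.0, 0.0], [0.0, 1.0]]    # what the upside-down view should show
    img = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(20):
        img = two_view_step(img, pull_toward(mountain, 0.5), pull_toward(island, 0.5))
    print(img)
```

The two toy targets here happen to be compatible, so the averaged image can match both views exactly; with real prompts the result is a compromise, which may be why the second image is harder to perceive.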

3D Gaussian Splatting

I woke up to 2 new articles and an online product demo already. The method is similar to NeRFs, but much faster. You can see a comparison of the two methods in this tweet from Hugues Bruyère.

The online product demo is at poly.cam. You need to upload between 20 and 200 image files to get your splat. I maybe uploaded the wrong images or not enough images of this dolmen I saw this summer, but I was still extremely thrilled to see it so quickly. I do like the painterly glitchy look of these, I admit! I’ll mourn when it’s perfect.

*Gif (I reduced size a lot) of my 3D gaussian made at poly.cam.*

There’s also DreamGaussian. Their project page shows their meshes being animated in Mixamo. You can try a colab with a gradio UI (thanks to camenduru and John Whitaker). I tried a “fluffy white owl” but it came out as quite a rainbow chunk and with the Janus problem (too many faces). Evidently it works better to go from image to splat. Also see the colab/code for Text23D with GSGen, which is text to splat.

List of Splats: viewers, tools, implementations for using 3D Gaussian Splatting, which I’ve been talking about here. Written before yesterday’s several papers and companies. Tooling is coming along!

DALLE-3

Various insiders are testing DALLE-3 and sharing. I have access via Bing’s Image Creator on mobile it seems (thanks to tips on twitter). This is its take on “An owl on a branch with a wooden sign saying ‘I’m DALLE3 in disguise.’” The text quality is how I know it’s DALLE-3. Much better, but still not perfect!

I tested some of my favorite prompts, and can attest that the “penguin in a carnival mask beside a canal in Venice, photoreal” looks like it should. But some other more complex prompts were a struggle. I can’t get any variation of “an asteroid hitting a space ship” on any platform yet that looks correct, even with “a spaceship being hit by an asteroid” and other mixes, wordings, etc. That concept seems to be hard. Here are some Wizard of Oz prompts and their DALLE3 output variants, from great to less good.

*DALLE3 outputs via Bing Image Creator, my prompts below*

Prompts, left to right:

A yellow brick road going into a dark forest (perfect! all of them were)
A dark castle with monkeys flying around it on bat wings, hd photorealism (otherwise I got little cartoon logo things, but also accurate!)
A yellow brick road beside a corn field with a scarecrow in it. This prompt tended to produce a lot of extraneous stuff, like pumpkins, cute farm sheds, etc. That pic is the best one. I’m fascinated by the view always being down the road, not from the road into the field.
A wall with an ornate gate of emeralds. All attempts tended to produce giant castles and not just simple pictures of a gate in a wall. The model works pretty hard to fill in the details on what it thinks is insufficiently specified, I guess.

I compared Midjourney and DALLE3 on a few favorite themes and still slightly prefer MJ’s stylistic interpretation and detail. I’ll wait for more general access via ChatGPT to go deeper.

Misc AI Articles and Stuff

‘Is It Good Enough to Fool My Gallerist?’ A good piece in NYT on artist David Salle working with a custom image generation model to try to inspire him. His work and style(s) are particularly difficult, having a collage-like look as well as having changed a lot over time. With AI, his work becomes “a pastiche of pastiches.” The article features Salle’s critique of the custom model’s outputs. He’s also experimenting with prompts from writers, like Sarah French’s cryptic line “Fold up your house, but unfold living.” What does a model trained in his style make of this poetic line?

*Excerpts from NYT piece on David Salle critiquing a model trained on his art*

The model is built by Grant Davis, who founded a new company, wand. (Note that Midjourney has also been promising custom model training for a while now. But they’ve also been promising a web site UI, so.) “Salle remains unsure if the algorithm has truly created art; simultaneously, he shows no signs of wanting the experiment to end.”

Apropos all the image processing algorithms in the intro segment: “The Surveillance AI Pipeline,” by Kalluri et al:

“The studies presented in this paper ultimately reveal that the field of computer vision is not merely a neutral pursuit of knowledge; it is a foundational layer for a paradigm of surveillance. … A purported view from nowhere is always a view from somewhere and usually a view from those with the greatest power.”

Bill Gurley’s video on regulation as market capture is as good as everyone said. I am, needless to say, pro open source.

Confessions of a Viral AI Writer in Wired is worth a read. There’s an initial take on getting a good line from a model, and then the gradual “it’s mostly cliche” realization that has hit all of us trying to use the big models for writing fiction (or poetry). Plus a lot of sad bits about public reaction to AI writing tools. There is a screaming need for stylistic training and personalization. As a reminder, Laika offers personalization based on your own text, and some public domain models. But still no capability to tune on purchased books, due to copyright fears. I am of course happy that the writers’ strike ended on a note allowing writers to use AI tools for assistance if they want, without docking their pay for it or requiring it.

More 3D:

Google Earth 3D meshes in Houdini (code).
SceneDreamer code and model were released, a project that generates unbounded landscape scenes from 2D images, with style and camera movement options.

📽 Animation: Just to note that I’ve seen some surprisingly awesome AI generated animations coming out from people using AnimateDiff with a Comfy UI (that complicated node tool that required many sessions of “let’s figure it out!”). Here’s a video how-to. I haven’t had time to replicate this node diagram. 😅 In other news, research project SHOW-1 is supposed to be a better video generator from text than, e.g., Runway Gen2, but no code yet.

Games-Related, Agents, Emergent “Bugs”

Neurocracy 2049 review in Eurogamer. This is the latest fake wikipedia-based mystery game, in which even the edits on the articles are (or may be) clues. “Is any new edit, by itself, true, or misleadingly presented by someone with an agenda? Are contradictory behaviours evidence of surface manipulation in the information presented here and now, or of deeper manipulation happening behind the scenes?” It’s browser based and after reading this, I wish I’d been playing along in real-time. But I’ll still dive in on a long weekend.

“‘They went to the bar at noon’: what this virtual AI village is teaching researchers,” a very brief article in Nature about the Sims-like agents paper by Park et al. Some choice comments on emergent behavior they observed (apart from the coordination of the party):

There were some subtle unexpected behaviours. Some of the agents started going to the bar at noon. I’m not blaming them, because I’ve done that. One of the interesting conversations that we’ve been having is: what is error, and what is not? There are also cases where the agents talk in a very polite and formal manner, because we’re using ChatGPT, which was fine-tuned to behave that way.

For reference, even in a “non-AI” system with deep procedural rules, emergent behaviors can be surprising. This is more or less the goal of procgen systems! Cf. the entire game of Dwarf Fortress, but most specifically, their bugs, or “bugs.” E.g. the recent “Bug 9653: Tavern keeper/performers repeatedly serve alcohol until patrons drink themselves to death.” The commentary is revealing:

over the last 2-3 month there have been 3 alcohol-related deaths in my fortress. the first one was a goblin bard so i thought this may be because he wasnt used to it, but the last victim was a dwarf citizen. im not even sure this is actually a bug as its kinda nice from a flavor perspective, just maybe a bit extreme? or maybe my 2 tavern keepers serve double drinks to the same people?

If you’re into these big sim games and discussion of bugs in them, you might want to check out Exadelic in my book recs. Magic is a kind of “bug.”

A new interactive tv show game thing that sounds very fun: “The Isle Tide Hotel: like Wes Anderson directing a playable episode of Doctor Who,” in the Guardian.

AI & Games Jam 2023 entries are up.

The Game UI database. Super! Screen shots of UIs.

This month I played Control, which I really liked — see Game Recs below for details and some links to related articles/videos.

NLP & DataVis

Intro guides! Teaching etc.

Jeremy Howard’s video Hacker’s Guide to LLMs with jupyter notebook.
A “Non-Engineer’s Guide to Train a LLama 2 Chatbot” from HuggingFace.
Also Meta’s own “Getting to Know Llama” notebook is very good for beginners.

Also for teaching: Visualizing Matrix Multiplication, Attention, and More from the PyTorch team.

Hercules: Attributable and Scalable Opinion Summarization. Code. I may have posted previously, but I just saw it again. Little parse trees!

Gregor Aisch likes Observable Plot, Observable’s plotting lib.

Alvin Chang’s datavis story in the Pudding about loneliness post-pandemic is really good scrollytelling. Here is his “behind the scenes” process and design article.

AI-generated react UI components, maybe? (Not a product/company.)

Book Recs

A minor observation that it’s a trope all over zombie flicks, and also in a lot of these ghost and AI stories, that the humans are worse than the “monsters.”

🤖 We Have Always Been Here by Lena Nguyen (sf). A neuro-atypical young woman ends up inheriting the role of psychologist on a secretive mission to a weird planet; she relates better to the androids than to the humans on board, who are all angry conscripts working off life-time debts. Something weird is happening on the planet: time, gravity, disappearing fractal mountains. The androids are acting strangely too, and there might be a saboteur on board. It’s a nice look at AI friends: if you liked Klara and the Sun, you might like this.

⛤ Exadelic by Jon Evans (sf). If you’re into time travel and simulation theory (“what if we’re living in a sim”) and AI actors, you might like this! One of the AI characters asks at some point, “If this is a simulation, then who’s running it? It isn’t God.” I really enjoyed this. It pulls in a number of paranormal/magickal threads too, including Jack Parsons and Marjorie Cameron. Parsons was one of the founders of the Jet Propulsion Lab and a practicing occultist. I found the ending uncompelling, but that may be the nature of this beast.

🏔 Beyond the Hallowed Sky by Ken MacLeod (sf). FTL (faster than light) travel is theorized by a scientist, which kicks off a political-scientific mess for her. There are 3 large nation-states: the EU one, the American-Anglo one (roughly), and the Chinese-Russian one. They all have AI entities that are core to their functioning, with quite different personalities. It turns out only the EU state hasn’t got FTL and isn’t exploring extra-solar planets, so the scientist ends up defecting and building a ship for them with a scrappy team. Meanwhile, alien rocks are waking up on a bunch of planets, including Earth, and all the AIs are interested. There is an excellent robot spy. Book 1 of a series, good space opera!

👻 The September House by Carissa Orlando (fantasy/horror). A very, very haunted dream house… the phenomena get worse in September. Her husband has left, he can’t take it anymore, while the woman who stays is reluctant to give up her lovely house which she can stay in if she just “follows the rules.” But we start getting the history of their marriage, and there were a lot of rules there too. This is not for the faint of heart, it’s very gory: children dismembered, abuse, etc.

🏚 A Theory of Haunting by Sarah Monette (fantasy/horror). A curator has to visit a haunted house to document a collection and try to extricate a donor from a group of occultist kooks holding séances. Lots of angry ghosts and an angry building, too. (CW: child violence, gore.)

📸 Shutter by Ramona Emerson (mystery/horror). This was a prize winner, about a Navajo crime scene photographer who sees the dead. She is resolutely haunted by one dead victim and needs to untangle the violent death to get the ghost to leave. Interwoven with this, we get the story of her childhood and the Navajo rituals and protections her older relatives enact when they learn of her ability. (CW: gore, violence.)

🐦 Neighbor George by Victoria Nelson (lit fantasy). I read this because it was rec’d on the Weird Studies podcast, along with her non-fiction. It’s very good, if you like creepy: A young woman house-sitting in a hippy North Bay CA town crushes on her new neighbor George. But he’s odd and increasingly threatening; and she has a family tragedy she hasn’t confronted fully that happened here. There’s a grove of trees that she’s inexplicably afraid of. George keeps a big wooden book he says is about her, written in the language of birds. There are lots of drunk, posturing poets. It’s a great weird read and sometimes funny. (CW: violence, bad sex.)

Game Recs: Control, mostly

★ Control: Thanks to recs from friends, I played through Control on the easiest combat settings. If you’re into paranormal weirdness, this is a great game for you. You can check out a good Control review in the Guardian from a few years ago, too.

Outline: Jesse arrives at the Federal Bureau of Control, an office building in New York City where a secretive government agency “manages” (tries to control) exposure and risk from paranormal objects and events. X-Files-ish, sure, but fewer aliens and more alternate dimensions. The building seems deserted initially, in lock-down, and the director is dead in his office. Eventually Jesse finds other employees in hiding who tell her she is the new Director. 😮 She explores the site, trying to find information about what happened. Something extra-dimensional (“the Hiss”) has possessed the bodies of the employees, turning them into mindless combat drones and re-landscaping the building into blocky alternate spaces. She “cleanses” the space by “taking control” and it reverts to the usual brutalist office shape. Along the way she picks up key cards, weapons, and some abilities, like telekinesis and levitation.

*Creepy possessed floaters in the office space (pic via RPS review)*

Throughout the building are redacted memos and videos, records of bureaucratic minutiae (book clubs, leave policies) and power struggles, documentation of incidents called Altered World Events and Objects of Power — televisions, rubber ducks, radios, phones, a magic gun… In true weird fiction trope, the easily overlooked Finnish janitor turns out to Know Things and See Things. The FBC uses alien black rocks that look like the monolith in 2001 as a power source; one of my favorite details is the etched tree on one of them. That monolith has been moved to a barren warehouse floor filled with mist. Gamers think it might be Yggdrasil from Norse mythology, the tree of nine worlds.

One of the best devices (worth playing for this alone imo) is the Oceanview Motel, a liminal magical space that is used as a connector in the FBC, where “the rule of three applies.” She has to pass through it to get to extra secret locations. The sound design here is excellent. Note how they don’t know what most of the locked door symbols mean… they can only ever use the inverted black pyramid.

*The lobby of the Oceanview Motel, Remedy Games*

I was a big fan of the architectural details; they really conveyed a soulless government office. There is a post-script endgame bit that leans hard into the horror of working in this kind of office, almost worse than the monster bits. A relevant aside: I mentioned I’m taking the online AI & Art course run by philosopher of art and the weird, JF Martel. JF assigned the classic Deleuze article “Postscript on the Societies of Control,” which I had actually never read. (Truth: I was actually finishing the game Control when it was being discussed and I totally forgot about the office hours.) Anyway, in brief, in societies of control, individuals are de-individualized and become data bits, or cogs in a machine. You know, like us! The postscript in the game really brought that home. “It’s up to them to discover what they’re being made to serve,” Deleuze says.

Re the building and art design, I rec Concrete and Control, a video about the architecture and related themes in the game. It gives you a fab overview: except they never mentioned the Oceanview Motel which is a real oversight!

VR: I really like jumping and swinging in low-poly Windlands. Not for the motion sick in VR or people afraid of heights! I enjoy the weird architecture and sketch of ruins, even in low fidelity.

Poem: “Self Portrait As An Angel”

i will most certainly have wings of roaring flame
eleven of them, seven unstable faces, your worst
shame as my eyes, your darkest desire as
my sword, bloody and oh so beautiful
and i will finally look
like the terrible thing
i have always been

haven't you always
wanted power beyond
sense, haven't you asked for it
in your most sinful nights, sworn to pay
whatever the cost? i am not here to answer
or lend grace, perhaps you do not understand
how this actually works. i think
you should be very afraid.

—Akwaeke Emezi (2023) via Matthew Ogle’s Pome

This made me think of Rilke’s angels in the Duino Elegies, which are terrifying.

If you made it through to the end, thank you! These are always too long, I know. I’d love to hear from you, as usual. And share and rec!

Best, Lynn (@arnicas on the sfka twitter, mastodon, and bluesky, where I have some invite codes if you get in touch)

Sebastian Martin

Oct 19, 2023

Control is a great game, I sadly haven't finished the story, but plan to do so over the next months. The combination of brutalist architecture, liminal spaces, cold war occultism just ticked many of my boxes. Thanks for bringing it up!

The Bird Soup Diaries

Oct 24, 2023

Not sure if you already read MIT’s Tech review, but just saw this article which might be of interest: This new data poisoning tool lets artists fight back against generative AI

https://www.technologyreview.com/2023/10/23/1082189/data-poisoning-artists-fight-generative-ai

Things I Think Are Awesome