Skynet: Your Friendly Touring Lighting Designer, Part I
I wonder, sometimes, at the sorts of scenarios that military specialists imagine, particularly specialists tasked with dreaming up out-there possibilities like Klaatu landing on the South Lawn of the White House or (perhaps somewhat more likely) something both armed and dangerously experimental escaping from Boston Dynamics. Fun to think about, for sure, but also important. Even if every one of these scenarios is unlikely, this sort of imaginative play is vital to staying agile across a variety of domains. “What if…” is, and has always been, a question that we should ask from time to time.
While not as (militarily) dire as, say, a Skynet scenario, an economically and – perhaps artistically – troubling conversation has been taking place over the past decade or so, which goes like this: “Will The Robots take over all our jobs because they are soulless metal machines having need of neither sleep nor sustenance nor love?” By “The Robots”, these days we of course mean computer programs like expert systems or natural language processing question-and-answer systems. Automation (and the worry that it will take our jobs) is old: mules were grinding wheat into flour many hundreds of years before the industrialists of yore dreamed up the same setup with humans, at scale, and automation remains a reasonable worry. The exponentially increasing complexity of our hardware and software is the force that has enabled truly worldwide automation. This has displaced a lot of workers in more traditional sectors of the economy, and it tends to make people very upset. In some cases, literal robots do the manual grinding that human hands used to do – car manufacturing and assembly being the textbook example – but contemporary examples range from far more complex technology like self-driving cars and trucks and fleets of product-picking robots in warehouses to headline-grabbing examples like the self-driving software of a certain electric car that has a tendency to experience multiple meanings of the word “crash” in unfortunate ways. All this aside, there are interesting questions here about the nature of work and whether systems like the ones we’re describing will take all our jobs, but those will have to wait for Part II. In this article, I want to consider the potential benefits and utility of such AI systems to our creative work.
For some time, a sort of generally optimistic self-assurance has characterized our biz, at least during the times I can recall it being a topic of conversation. We have seemed to think that the creative aspect of our industry will provide some amount of insularity against the winds of automation that we all see coming for other jobs. To the extent that there have been examples of computers coming up with original works, they have tended to be of – shall we say…uneven quality. An example would be early attempts at computer-generated music, one of the first of which is generally agreed to be the Illiac Suite for String Quartet, although this is a piece of music wherein a computer used elements of composition that were chosen by a human composer, so the question as to whether it was “truly” composed by a human or a computer is an interesting philosophical one1. By the way, my use of the term “automation” is intentional here, and I’m going to use it and the term “AI” interchangeably throughout this article; we can have all sorts of arguments about whether the term “artificial intelligence” applies correctly here, but those are beyond the scope of what we’re talking about for the moment. Fully sentient science-fiction AI like Lt. Commander Data or Haley Joel Osment is clearly beyond our current technological means. What we’re usually talking about when we speak of today’s AI are systems that digest massive amounts of pre-existing work, do “some math”, and create something assembled from everything they know2.
But now, DALL-E and GPT-33 are on the scene, imagining the entire scene for us if you want, far better than anything that has ever been available before; an assemblage of algorithms so complex that most explanations of how they work veer into nearly meaningless oversimplification, because they’re just that sophisticated. If you’re into this sort of thing, then by now you’ve already considered just how comfy that avocado armchair might be, and while quirky generative art is oodles of fun, it seems to me that there are interesting possibilities for our particular creative niche that remain, at present, unconsidered.
I spend a lot of time trying – with various degrees of success – to be creative in my day-to-day job: translating a song into a color, or coming up with a stage design based on some text input. This is enjoyably challenging, but also draining, as anyone who has a job in the “creative” industries will tell you. While vast hordes of internet men have digital warehouses full of digital snake oil purporting to unlock your hitherto unknown troves of creativity with their salves and lotions and potions and snazzy new apps, the secret sauce of creativity has always been, for me, patience. It is the understanding that there is very little I can come up with that hasn’t been done before, at least in some form, and that the best way to chase that creative dragon is by looking at other things and awaiting inspiration. Often, this takes the form of books and blogs and even movies and TV shows, but generative AI software (not just image-generative AI) offers a fascinating tool in our toolkit for imagining The New. If you follow this sort of news at all, you’ve likely seen images produced by these systems. A typical interaction for a basic user goes like this: you input a prompt (“Bemused Golden Retriever wearing the uniform of a 17th-century Marshal of France”), the engine assembles bits and pieces from everything it knows about Golden Retrievers and French history, and you get a unique, never-before-seen image of what you asked for, one that only gets more convincing as these systems improve. So far, so cool. The implications for scenic design here are obvious. I’ve been playing around with one of these systems – MidJourney – for a while now, and generated what I think are some really interesting designs for stages that, while not completely practical, represent an excellent jumping-off point for someone interested in coming up with a cool design4. Here are a few I’ve come up with, and the associated prompts:
There are very clear applications to be considered here, particularly as regards scenic and set design and, of course, graphic design as well. Considering only images, however, seems to me a very limiting way to use software like this, particularly given the advancements in things like natural language processing and text synthesis, as well as code review. There’s potential here for a fascinating next step: a mix of audio analysis, generative AI systems, text prompting, and the ability of these systems to synthesize text and code could lead to a real advance in the world of programming for concerts. The image- and text-generative aspect seems to me to be the real advancement here, and it nicely ties together all this tech that’s been kicking around at the periphery of the public consciousness. What do I mean by this?
Programming automated lighting is, at its heart, translating the look and feel of the music through a process of creativity, imagination, and experience. We base our looks on chopping the songs into sections: an intro, a verse, a chorus, a tag. These sections are defined by vocals and instrumentals, and they tend (if the track is mixed well) to fall into somewhat definable acoustic ranges, along with amplitude and beat changes. With enough raw example data fed into a system5, I see little reason why a well-trained system couldn’t infer, with a reasonable degree of certainty, the structure of a previously-unheard track. And once the structure of the song can be identified, things can start to get really interesting.
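To make that idea concrete, here is a minimal sketch of the kind of first-pass analysis I’m describing, using the open-source librosa library for Python. The fixed section count and the choice of MFCCs as the “does this part sound different?” feature are assumptions made purely for illustration; a real system would learn its segmentation from training data rather than rely on a simple clustering.

```python
import librosa

def rough_song_structure(path: str, n_sections: int = 8):
    """Estimate tempo and coarse section boundaries for an unheard track."""
    y, sr = librosa.load(path, mono=True)

    # Global tempo estimate and beat positions.
    tempo, beat_frames = librosa.beat.beat_track(y=y, sr=sr)

    # Timbre features (MFCCs) roughly capture "this chunk sounds different
    # from that chunk" -- verses, choruses, breakdowns, and so on.
    mfcc = librosa.feature.mfcc(y=y, sr=sr)

    # Cut the track into n_sections contiguous, self-similar segments.
    boundary_frames = librosa.segment.agglomerative(mfcc, n_sections)
    boundary_times = librosa.frames_to_time(boundary_frames, sr=sr)

    return float(tempo), boundary_times


if __name__ == "__main__":
    bpm, sections = rough_song_structure("track.wav")
    print(f"~{bpm:.0f} BPM, sections start at {[round(t, 1) for t in sections]} s")
```

None of this is exotic; these are off-the-shelf music-information-retrieval tools, which is part of why the inference I’m describing doesn’t feel far-fetched.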
I’m speculating here – all prognosticating about the future is speculation – but such a system could look like this: after analyzing the audio, the user provides some input about the sort of song they want to program. That input would need to tell the program which programming style to aim for (“Try to interpret musical elements more for timecode programming”, versus “Interpret detected chunks of music as a manual Go cuelist”, versus “Build some busking pages based on this list of tracks”, perhaps on a sliding scale) and would then need a few other inputs, sketched below. Such a system could apply knowledge of how programming is done, gained from training on automated lighting rigs and songs, and, armed with that library of knowledge, use a sort of inference engine to think like a programmer.
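Here is a purely hypothetical sketch of what those “few other inputs” might look like as a data structure. Every name in it is invented for illustration; no console manufacturer exposes anything like this today.

```python
from __future__ import annotations
from dataclasses import dataclass, field
from enum import Enum, auto

class ProgrammingStyle(Enum):
    TIMECODE = auto()    # interpret musical elements as timecoded events
    MANUAL_GO = auto()   # treat detected chunks of music as a manual Go cuelist
    BUSKING = auto()     # build busking pages from a list of tracks

@dataclass
class ProgrammingRequest:
    tracks: list[str]                      # audio files to analyze
    style: ProgrammingStyle = ProgrammingStyle.TIMECODE
    energy: float = 0.5                    # 0..1: how eagerly to react to changes
    band_weights: dict[str, float] = field(
        default_factory=lambda: {"lows": 1.0, "mids": 1.0, "highs": 1.0}
    )
    color_scheme: list[str] | None = None  # None = let the system choose
```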
Analysis of the audio could reveal basic facts (BPM, the structure of the verses and choruses, detection of instrument parts) and, with some user input indicating a color scheme (or using one that the AI comes up with), the system could create a more or less basic cuelist for a song. For instance, let’s say the program detects a beat in the upper-high mids of the spectrum. It could deduce that the beat is a hi-hat rhythm at a weighted -2 dB average relative to the rest of the song and, through its vast library of programmed experience, conclude that what is called for is a subtle dimmer chase that fires on every instance of that musical element. The user could specify different weights to be applied to different parts of the audio spectrum, for instance paying more attention to this or that frequency range. An “energy” function could control the weight that the algorithm gives to changes that it hears, while other selectable options could offer differing inputs for how many effects should be generated, or how often movement between focus positions should occur. The system could offer a color-picker-like interface with features like those that already exist for web developers and graphic designers: the ability to pick a color scheme using triads, tetrads, complementary or opposing colors, etc6.
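As a toy illustration of the hi-hat example, here is what band-limited onset detection turned into chase steps might look like, again using librosa, plus a tiny color-harmony helper of the kind I mean. The band edges, the fixed chase level, and the cue format are all invented for the sketch.

```python
import colorsys
import numpy as np
import librosa

def hihat_chase_cues(path: str, band_hz=(6000, 10000), level=0.3):
    """Fire a subtle dimmer-chase step on every onset heard in one frequency band."""
    y, sr = librosa.load(path, mono=True)
    spec = np.abs(librosa.stft(y))
    freqs = librosa.fft_frequencies(sr=sr)

    # Keep only the slice of the spectrum we care about (roughly hi-hat territory),
    # then track how sharply the energy in that band changes from frame to frame.
    band = spec[(freqs >= band_hz[0]) & (freqs <= band_hz[1])]
    envelope = librosa.onset.onset_strength(S=librosa.amplitude_to_db(band), sr=sr)

    onset_frames = librosa.onset.onset_detect(onset_envelope=envelope, sr=sr)
    onset_times = librosa.frames_to_time(onset_frames, sr=sr)

    # Each detected hit becomes a tiny "cue": a time plus a dimmer bump.
    return [{"time": float(t), "effect": "dimmer_chase_step", "level": level}
            for t in onset_times]

def triad(base_hue: float):
    """Three fully saturated RGB colors spaced 120 degrees apart in hue."""
    return [colorsys.hsv_to_rgb((base_hue + offset) % 1.0, 1.0, 1.0)
            for offset in (0.0, 1 / 3, 2 / 3)]
```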
Systems like this will need successive iterations to get workable results – but then, so do human beings when they’re working on programming. The most workable paradigm could be audio analysis that shows a quick preview of what it has detected, highlighting every event along a timeline with transport controls, so that the user could quickly and easily preview the results before committing the changes. I’m envisioning this primarily as a tool to assist with the often time-consuming process of programming your songs to SMPTE timecode, but it could just as easily be applied to “traditionally” run songs with the GO+ button. Systems like this need not be “monolithic” modules of the lighting software, either: you could use beat detection for timecode without asking the software to pick colors for you, or ask for a blank cuelist for a song that you could fill in yourself, and save yourself a few additional keystrokes.
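The timecode half of that is the least speculative part: converting detected section or beat times into SMPTE stamps and an empty cuelist is plain arithmetic. The sketch below assumes 30 fps non-drop-frame timecode and a made-up cue format, just to show the shape of it.

```python
def to_smpte(seconds: float, fps: int = 30) -> str:
    """Convert a time in seconds to a non-drop-frame SMPTE timecode string."""
    total = int(round(seconds * fps))
    frames = total % fps
    secs = (total // fps) % 60
    mins = (total // (fps * 60)) % 60
    hours = total // (fps * 3600)
    return f"{hours:02d}:{mins:02d}:{secs:02d}:{frames:02d}"

def blank_cuelist(section_times, fps: int = 30):
    """One empty cue per detected section, stamped with its timecode."""
    return [{"cue": i + 1, "timecode": to_smpte(t, fps), "look": None}
            for i, t in enumerate(section_times)]
```

Feed it the section boundaries from the earlier structure sketch and you get exactly that modest, modular result: a correctly time-stamped blank cuelist you can fill in yourself.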
Where else could this be useful? I foresee lots of different scenarios: rolling into a festival where the artist adds a song last-minute and you need to come up with some looks quickly, or sitting in a studio late at night needing a hint of inspiration because it’s song number ten you’ve programmed today and the showfile needs to be sent in the morning, or whatever. Even without the more advanced options I’m envisioning here, having some sort of advanced beat and section detection would be great for programming things to timecode: quickly aligning events with a beat, or even quickly generating events that could then be mapped to your timeline.
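Even that last, modest feature, snapping hand-placed events onto a detected beat grid, is only a few lines once the beats are known. A minimal sketch, assuming both lists are plain times in seconds:

```python
import numpy as np

def snap_to_beats(event_times, beat_times):
    """Move each event to the nearest detected beat (all times in seconds)."""
    beats = np.asarray(beat_times)
    return [float(beats[np.argmin(np.abs(beats - t))]) for t in event_times]
```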
Would something like this count as “AI”? Ask ten different people and you’ll get ten different answers. I think such a system would fall under the category of a “Weak AI” or a “Narrow AI”. It would still be an assemblage of minimally related components or modules all interacting to solve a single problem, i.e., “program lights given this input”, which – to me – counts as a weak AI. And all of this is simply my imagining, of course. The larger point I’m trying to make is that I think at least the ability to program this way is almost inevitable, given the initial advancements we’ve seen with natural-language processing and interpreting inputs in a way that “makes sense” to most end users. We as programmers and designers are not immune to the automation that seems to be heading our way, and that raises serious questions about the nature of art and humanity, and the crossroads where they meet technology. In the second part of this series, we’ll examine the existential changes that could be possible for our business.
- This entire article could have been titled “An Ode to Equivocation”.
- “Hmm,” the astute reader will intone, stroking their chin. “Isn’t that the same thing humans do?” Hang on, Skippy, we’ll get there.
- There is also ChatGPT, which is the text equivalent of these generative AI systems, and which I believe helps make my point about the adaptability of these systems.
- I find arguments about the artistic and “real work” value of such image systems questionable, and further, consider these arguments outside the scope of this article.
- No shortage of this.