Tuesday, March 7, 2023

The Mozbot Mashup: Roger Explores the World of Generative AI Imagery

AI image generation has taken big leaps forward in the last year. It’s fun to play with. It’s a little bit weird. It can produce some mind-blowing results — and often laughable ones.

But is it useful in a marketing context?

We decided to find out, and our valiant SEO robot, Roger, was volunteered to be our first test subject. Don’t worry, he was cool with it. He was actually pretty excited to have a machine intelligence to engage with, after spending so much time doling out SEO knowledge to us simple humans.

Training the model

AI imagery tools like Midjourney, Stable Diffusion, and DALL-E 2 are pretty amazing at creating images of just about anything you can come up with, but they have their own algorithmic and random-noise way of getting there. So while it is easy to get interesting results, it can be hard to get a specific one.

To get anything that actually looked like our friendly SEO Mozbot, we first needed to train a Stable Diffusion model. There are a lot of ways to go about this: some get pretty technical, while others use app interfaces to make the process easier for someone with a little less technical expertise.

We chose to start with Astria, a service that lets you customize (they call it tuning) a model of your own. A lot of users train it on their own likeness to make cool avatars (like the popular Lensa app), but we threw a bunch of variations of Roger in there, had him party with the AI model, and watched what kind of shenanigans they got up to.

A Rogues’ Gallery of Rogers

These tools generate images based on a text prompt, so our first prompt tested whether the model could output a version of Roger in a fun and colorful 3D style.

Not bad for first results! It was clear this generation drew heavily from photos of a Roger toy held in a hand, as well as a photo of our life-size Roger mascot at one of our MozCon events (thus, the people in the background of some of the images). These are all actually recognizable as Roger, which I was impressed by, though none of them are quite “right.”

Time to try something in a completely different style. How about “Roger Mozbot with a rocket jetpack and fishbowl helmet, watercolor painting”?

Some super fun results! And others that look like Roger is having a very bad time. Also, apparently the “rocket” part of our prompt gave Roger some hardware in some of the results that made it look like his switch was accidentally set from Hugs to Destroy.

Further iterations produced equally interesting, fun, terrible, and wacky results as we messed around with other styles, including more 3D, schematics, children’s book illustrations, and even anime!

They just keep coming…

Want even more Roger mashups? We experimented further with Scenario.gg, a tool aimed at creating game assets that also has a nifty way to train a generator. A bonus of this one is that you can use an existing image as the starting point for a generation, which gives you a little more control over how closely the result hews to that starting point. Here are some of those results:

If you’re following generative AI, you know it’s an area evolving incredibly fast right now, with new tools, features, and techniques constantly coming out. A couple of weeks after our initial round of generating on Astria, we dove back in and found that it now has a video-generating feature. A little trial and error later, we had a super cool little video of Roger to go with all those pictures:

What have we done?

We’ve put Roger through the AI wringer, but to what end? Sorry Roger, it was all in the name of… SCIENCE! And learning. The initial experiments produced plenty of quantity, but the quality was not quite there, at least for reproducing a brand mascot that has a specific look but may not be widely disseminated enough to have been part of the models’ training data. If you are a little less specific about the results you want, AI imagery is already achieving jaw-dropping results. They are good enough that we are finding other ways to use this imagery in our marketing material, and no doubt you have seen some really cool stuff in your various feeds. For getting a quality version of Roger in a new style or pose, though, it would be more efficient to have an actual person illustrate or render the artwork the traditional way.

As mentioned earlier, this technology is developing rapidly, and it seems like the game is changing every week with new models and new implementations that can make results better. As of publication, we’re already working on a new batch of Rogers using other tools, so look out for a follow-up in the near future.

Roger represents a software tool that humans can interface with to achieve greater things. Generative AI is a new and potentially very powerful tool of that kind for art and, for our purposes, brand design. Creative and talented people are still needed to guide the process, make decisions, and curate or clean up the results. So, here’s to humans and robots working together to achieve interesting things! We’ll just have to see where Moz and Roger go with this next.