Experiments with Midjourney Character Images

Experiments with more Midjourney character images, with examples from rural Africa and characters with disabilities

I’m continuing to experiment with Midjourney character images. I’ve had some initial success: for fairly typical office workers (which is a lot of what I use for scenarios), it works pretty well. I’ve also been testing some other ideas, with more mixed results. Specifically, I tried generating images for localizing content for Africa and images of characters with disabilities. Generating characters with assistive devices like wheelchairs, forearm crutches, and hearing aids turned out to be quite challenging.

In this post, I’ll share my prompts and results working with Midjourney character images. You get to see some successes and a bunch of failures from my experiments.

If you haven’t read my previous post explaining how to create consistent character images in Midjourney, it may be helpful to read that first.

Images of Rural Africa

Abdulkalim Sezirahiga asked a great question on LinkedIn in response to my previous post about creating consistent character images.

Interesting! Thanks a lot Christy Tucker this is helpful

One more thing as an Instructional designer based in Africa, it is sometimes hustle to find something like this, with the same real images that are localized.

Meaning images of the rural area environment and etc…

Any idea if Midjourney, can help?

-Abdulkalim Sezirahiga

I was curious too, so I tried this simple prompt using Abdulkalim’s location:

A woman standing outside the door of her home in rural Kigali City, Rwanda. Photorealistic
4 images of a woman standing outside a home in Rwanda, generated in Midjourney

I don’t know enough about Rwanda to judge if these are reasonably accurate. Abdulkalim says they might work.

Wow! That’s not bad. I think if it can produce such realistic images, there are definitely some tasks it can assist with that don’t require highly detailed images.

My worry with generating images for African audiences is that the source content used to train the models is WEIRD (Western, Educated, Industrialized, Rich, and Democratic). Obviously, Midjourney was able to create images, so it clearly does have some source material to draw from. However, I’d be cautious about this because the source material is more limited and potentially reflects stereotypes.

But tentatively, based on the reaction, this has possibilities for localizing content with relevant images that would have been challenging to source in stock libraries. If you’re generating images for a location or culture you’re not personally familiar with, the results need more validation by people who are well versed in that culture.

Wheelchairs

One ongoing challenge for finding character images for scenarios is adequate representation of disabilities. I wanted to see what would happen if I tried to create a consistent character in different poses who uses a wheelchair. Would Midjourney maintain consistency in the wheelchair?

My first problem is that Midjourney struggled to even create a realistic wheelchair for someone working in an office. It tried to merge an office chair with a wheelchair. I upscaled two of the images, but neither one was quite right.

I picked the better of the two to try creating variations. Using the --cref parameter in the prompt, I was able to create several alternate versions of the character. As with my earlier experiments, Midjourney does reasonably well at keeping the facial features consistent through different poses. However, if you focus on the wheelchair, you’ll see it varies from image to image. The spokes have different colors. Sometimes the wheels are different colors. The arms of the wheelchair move, are different styles, or disappear completely. I’d also prefer to not have push handles on the back of the wheelchair (that’s more the style for a wheelchair you use temporarily, not necessarily what you’d have for a long-time wheelchair user who is working in an office). But, maybe with a little cropping and editing, I could get this to work.

Character sheet for generating the source image

One suggestion I had seen for improving results with characters was to create a character sheet with multiple poses as the initial source. The idea is that the facial features may be more consistent when you then use that sheet with --cref. I thought that might help keep the wheelchair more consistent too.

Character sheet, Asian woman in her early 30s wearing a red blouse using an electric wheelchair, studio setting, flat background, Leica M10-R; image split into 4 different pictures, shot from multiple angles --ar 16:9
Three images of an Asian woman sitting in a wheelchair

Although I prompted for 4 images, you’ll see I got 3 instead. That’s pretty common with this character sheet approach. Midjourney is terrible at counting. It also ignored the word “electric” in the phrase “electric wheelchair.”

But, I tried using the character sheet as a source image anyway. I think it did a little better keeping the non-electric wheelchair consistent. However, I don’t think her facial features are as consistent with this set. I also was unhappy with how sheer it made her blouse in every variation.

Editorial photography, Asian woman, early 30s, wearing a red blouse, electric wheelchair, in a modern office conference room, Leica M10-R, --ar 16:9 --cref https://s.mj.run/Q1vcZ62n8S4 
4 thumbnails of an Asian woman wearing a red blouse and in a wheelchair

I tried again, changing to a white blouse instead of a red blouse to try to get something opaque. This time, I lowered the character weight parameter, using --cw 20. That allows for more variation of outfits and hair. Unfortunately, this also created more variation in her facial features. Her blouse is white…but it’s still mostly sheer.

Editorial photography, Asian woman, early 30s, wearing a white blouse, electric wheelchair, in a modern office conference room, Leica M10-R, --ar 16:9 --cref https://s.mj.run/Q1vcZ62n8S4 --cw 20
4 thumbnails of an Asian woman wearing a white blouse and sitting in a wheelchair
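
If you run a lot of these variations, the prompts above all follow the same pattern: a description, followed by the --ar, --cref, and --cw parameters. The Python below is only a sketch for assembling that text before pasting it into Midjourney. It does not talk to Midjourney at all, and the function name build_prompt and its arguments are hypothetical names chosen for illustration. The 0 to 100 range check on --cw is an assumption based on my understanding of Midjourney's documentation, so verify it before relying on it.

```python
def build_prompt(description, aspect_ratio=None, cref_url=None, character_weight=None):
    """Assemble a Midjourney prompt string (hypothetical helper, text only)."""
    parts = [description.strip()]
    if aspect_ratio is not None:
        parts.append(f"--ar {aspect_ratio}")
    if cref_url is not None:
        parts.append(f"--cref {cref_url}")
    if character_weight is not None:
        # --cw adjusts how closely the result follows the --cref image,
        # so it is only added when a character reference is present.
        if cref_url is None:
            raise ValueError("--cw requires a --cref image URL")
        # Assumption: --cw accepts values from 0 to 100.
        if not 0 <= character_weight <= 100:
            raise ValueError("--cw is assumed to be between 0 and 100")
        parts.append(f"--cw {character_weight}")
    return " ".join(parts)


# Rebuild the white blouse prompt from this section.
white_blouse_prompt = build_prompt(
    "Editorial photography, Asian woman, early 30s, wearing a white blouse, "
    "electric wheelchair, in a modern office conference room, Leica M10-R,",
    aspect_ratio="16:9",
    cref_url="https://s.mj.run/Q1vcZ62n8S4",
    character_weight=20,
)
print(white_blouse_prompt)
```

Keeping the description separate from the parameters makes it easier to change one thing at a time, such as the blouse color or the character weight, and compare the results.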

Forearm crutches

While wheelchairs are one of the most common ways visible disabilities are depicted in stock images and scenarios, the reality is that a lot of different tools are used by people. Forearm crutches are another option, so I wanted to experiment with that. I was looking for something like this stock image.

Initially, I couldn’t even get any crutches to be part of the image. I shifted the image dimensions and prompted for full body images to see if I could improve the results.

A white woman in her late 20s with light brown hair, wearing a dark blue blouse, using forearm crutches. Standing next to a window in a modern office. Full body image --ar 9:16 --cref https://s.mj.run/jUw5BUgDZ7k

As you can see below, it didn’t really work. She has sort of a cane, but definitely not a forearm crutch.

A white woman in her late 20s with light brown hair, wearing a dark blue blouse, holding sort of a cane (but not a forearm crutch).

I thought maybe the window was throwing off the model, so I tried a different setting of a conference room. I at least got two canes in these results, but still zero forearm crutches. I think Midjourney just doesn’t have enough source images to understand what I meant with this specific request. Also, as with my previous experiments generating the image of a woman in a wheelchair, her blouse gets more sheer with subsequent generations. Ugh.

4 images of a woman wearing a blue blouse with varying degrees of sheerness, using two canes

Hearing aids

When you prompt for images with a hearing aid, Midjourney really, really wants to give you results with older characters, regardless of what you request. Do these results look like a man in his mid-40s to you? I tried a few variations of shirt color and race. I got a lot more gray hair than I do if I prompt for the same age without a hearing aid.

None of the “hearing aids” really look quite like hearing aids either. They look more like ear buds. Some of them have wires and microphones that make them look even more like ear buds.

Character sheet, Black man with a hearing aid, mid 40s, wearing a dark gray button down shirt, studio setting, flat background, Leica M10-R, image split into 4 different pictures, shot from multiple angles --ar 16:9

I tried changing to a woman, setting the age younger, and requesting a “behind-the-ear hearing aid” to be more specific. This was a little better, but I still got some ear buds and headphones rather than the hearing aids I was looking for. I got at least one set that seemed potentially workable though.

Character sheet, white woman with a behind-the-ear hearing aid, mid 30s, wearing a blue button down shirt, studio setting, flat background, Leica M10-R, image split into 4 different pictures, shot from multiple angles --ar 16:9
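
Since I was swapping out race, gender, age, device wording, and shirt color by hand, one way to keep track of the combinations is to generate the prompt text from a template. This is a rough sketch in Python, not a Midjourney feature: the template simply mirrors the wording of the two character sheet prompts above, and the names CHARACTER_SHEET_TEMPLATE and character_sheet_prompts are made up for this example. You would still paste each prompt into Midjourney yourself.

```python
from itertools import product

# Template copied from the character sheet prompts in this post.
CHARACTER_SHEET_TEMPLATE = (
    "Character sheet, {person} with a {device}, {age}, wearing a {shirt}, "
    "studio setting, flat background, Leica M10-R, "
    "image split into 4 different pictures, shot from multiple angles --ar 16:9"
)


def character_sheet_prompts(people, devices, ages, shirts):
    """Return one prompt per combination of the given options (hypothetical helper)."""
    return [
        CHARACTER_SHEET_TEMPLATE.format(person=person, device=device, age=age, shirt=shirt)
        for person, device, age, shirt in product(people, devices, ages, shirts)
    ]


prompts = character_sheet_prompts(
    people=["Black man", "white woman"],
    devices=["hearing aid", "behind-the-ear hearing aid"],
    ages=["mid 40s", "mid 30s"],
    shirts=["dark gray button down shirt"],
)
for prompt in prompts:
    print(prompt)
```

Generating every combination makes it easier to see which single change (for example, the more specific “behind-the-ear hearing aid” wording) is responsible for a better or worse result.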

Cochlear implants

Midjourney has no idea how to generate a cochlear implant image. It either ignores the prompt or does…something else. For example, all of these images have wires going down the side of the neck rather than a wire connecting the behind-the-ear portion with a transmitter attached to the back of the skull.

4 thumbnails of a woman from the back, showing various devices on her ear. Some look like steampunk headphones or cybernetic devices. All of them have wires going down the side of the neck.

Overall, disappointing results for disability representation

I’ll keep experimenting, but overall, I was disappointed with the results of Midjourney character images with disabilities. I think some of these are potentially usable, but it would take quite a bit of effort refining prompts to get a really good set. Given the current state of the technology, I think it’s probably easier to continue to use stock images or to edit vector character images than to prompt using AI.

Hope for future development

However, AI image generation technology is improving at an amazing pace. While I don’t feel like it’s successful right now, in 6-12 months, the situation could be completely different. After all, six months ago we couldn’t generate consistent characters in different poses, and now that’s fairly easy with --cref in Midjourney.

If you’re curious to see how fast the technology is improving, this article shows the evolution of Midjourney from version 1 to version 6, with just 18 months between them. It’s truly amazing to see how far we’ve already come. That gives me hope that we’ll have better results in the future.

Practitioner, not AI expert

As a reminder: I’m not an expert in AI. In this blog post, and everything else I do, I approach this as a practitioner using the tools, rather than an expert in AI overall. I know there are plenty of people out there claiming to be overnight experts in AI, and that’s definitely not me. But I do think it’s important for us in L&D to at least experiment with these tools and see what’s possible.

Your results?

If you test out Midjourney or any other tools for generating images (especially images of characters using assistive devices), let me know about your results.
