Recently, I have become interested in conversational AI such as ChatGPT and image generation AI such as Midjourney, and I have been experimenting with them.
In this article, I try out image generation AI. Image generation AI takes text instructions, called prompts, that describe the image to generate, so I will change the prompt step by step and see how the images change.

Environment

I built stable-diffusion-webui (AUTOMATIC1111) in my local environment and ran it there. I will not explain the setup here, but many articles cover it.
My environment is as follows. It is not a comfortable setup because of the low specs, but it still runs without much stress.

CPU: Intel Core i5-8300H CPU 2.30GHz
RAM: 16 GB
GPU: NVIDIA GeForce GTX 1050 (VRAM 4GB)
OS: Windows 11 Home (22H2)

The model I used with stable-diffusion-webui is "chilled_re-generic". Since the subject of my prompts was "girl", I expected this model to give good-quality images.
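For readers who want to repeat this kind of comparison from a script, here is a minimal sketch. It rests on several assumptions that are not from the article: that the web UI was started with the `--api` flag, that it listens on the default local address, and that the seed is fixed so only the prompt changes between runs. The values for `seed`, `steps`, `width`, and `height` are illustrative, and `build_payload` and `generate` are hypothetical helper names. The images in this article were not necessarily produced this way.

```python
import base64
import json
import urllib.request

# Assumption: the web UI was launched with --api on the default local address.
WEBUI_URL = "http://127.0.0.1:7860"


def build_payload(prompt, seed=1234, steps=20, width=512, height=512):
    """Build a txt2img request body.

    The default values are illustrative, not the settings used in the article.
    A fixed seed keeps runs comparable when only the prompt changes.
    """
    return {
        "prompt": prompt,
        "seed": seed,
        "steps": steps,
        "width": width,
        "height": height,
    }


def generate(prompt, out_path, **options):
    """Send one txt2img request and save the first returned image."""
    body = json.dumps(build_payload(prompt, **options)).encode("utf-8")
    request = urllib.request.Request(
        WEBUI_URL + "/sdapi/v1/txt2img",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    # Assumption: images come back as base64-encoded strings in "images".
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(result["images"][0]))
```

With the web UI running, calling `generate("a girl", "a_girl.png")` and then repeating the call with each longer prompt would give one image per prompt from the same seed, which makes it easier to attribute a change to the prompt.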

Prompts and Generated Images

"a girl"

The following is a simple image of "a girl".
The outline of the image is kind of blurry.

"portrait of a cute girl"

I added "portrait of". I also added "cute" (lol).
It looks good (maybe "cute" is working?). Parts of the hand look malformed, though.

"portrait of a cute girl, ultra photorealistic, highly detailed, HDR, 8k, sharp focus"

I added keywords such as "ultra photorealistic", "highly detailed", and "sharp focus".
I can't tell how much each word contributes, but I think it looks even better, don't you?

"portrait of a cute girl, ultra photorealistic, highly detailed, HDR, 8k, sharp focus, clear facial features, cinematic, 35mm lens, f/l.8, accent lighting, global illumination"

I added more words, taken from articles and sites that introduce prompts. They are words about cameras and lighting.
But the image doesn't seem to change much from the previous one.
Image generation AI reportedly tends to give more weight to words that come earlier in the prompt, so I wonder if these words have little effect because they come near the end.

"portrait of a cute girl, ultra photorealistic, highly detailed, HDR, 8k, sharp focus, clear facial features, cinematic, 35mm lens, f/l.8, accent lighting, global illumination, masterpiece, trending on artstation, approaching perfection"

I added more words: "masterpiece", "trending on artstation", and "approaching perfection". These often appear when you look at other people's prompts.
Again, not much seems to have changed. I suspect they have little effect for the same reason as before: they come at the end of the prompt.

"portrait of a cute girl, masterpiece, trending on artstation, approaching perfection, ultra photorealistic, highly detailed, HDR, 8k, sharp focus, clear facial features, cinematic, 35mm lens, f/l.8, accent lighting, global illumination"

So next, I moved the part from "masterpiece" onward to right after "a cute girl".
It still doesn't work very well. Come to think of it, words like "masterpiece" and "trending" are ambiguous.
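The reordering described above can be sketched as a small helper. This is only an illustration: `split_terms` and `move_after_subject` are hypothetical names, and the sketch assumes the prompt is a comma-separated list whose first term is the subject.

```python
def split_terms(prompt):
    """Split a comma-separated prompt into trimmed, non-empty terms."""
    return [term.strip() for term in prompt.split(",") if term.strip()]


def move_after_subject(prompt, terms_to_move):
    """Move the given terms to right after the first term (the subject).

    The relative order of the moved terms and of the remaining terms
    is preserved.
    """
    terms = split_terms(prompt)
    subject, rest = terms[0], terms[1:]
    moved = [term for term in rest if term in terms_to_move]
    kept = [term for term in rest if term not in terms_to_move]
    return ", ".join([subject] + moved + kept)


previous_prompt = (
    "portrait of a cute girl, ultra photorealistic, highly detailed, HDR, 8k, "
    "sharp focus, clear facial features, cinematic, 35mm lens, f/l.8, "
    "accent lighting, global illumination, masterpiece, "
    "trending on artstation, approaching perfection"
)
reordered_prompt = move_after_subject(
    previous_prompt,
    {"masterpiece", "trending on artstation", "approaching perfection"},
)
print(reordered_prompt)
```

Generating prompt variants this way avoids copy-and-paste slips when the same set of terms is tried in several positions.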

"portrait of a cute girl, cinematic, 35mm lens, f/l.8, accent lighting, global illumination, highly detailed, HDR, 8k, sharp focus"

Finally, I moved the camera and lighting terms, which had made little difference, to right after "a cute girl". I removed "clear facial features" (for some reason, it made the face white). Words like "masterpiece" are gone as well.
I can't say for sure, but the lighting seems to be reflected. The image seems to lack definition, perhaps because the quality-related words have moved to the end of the prompt.
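When several terms move and disappear at once, it helps to list exactly what changed between two prompts before judging the image. The following is a minimal sketch under the same assumption of comma-separated prompts; `split_terms` and `diff_terms` are hypothetical helper names.

```python
def split_terms(prompt):
    """Split a comma-separated prompt into trimmed, non-empty terms."""
    return [term.strip() for term in prompt.split(",") if term.strip()]


def diff_terms(old_prompt, new_prompt):
    """Return (removed, added) terms between two prompts, in prompt order."""
    old_terms = split_terms(old_prompt)
    new_terms = split_terms(new_prompt)
    removed = [term for term in old_terms if term not in new_terms]
    added = [term for term in new_terms if term not in old_terms]
    return removed, added


old_prompt = (
    "portrait of a cute girl, masterpiece, trending on artstation, "
    "approaching perfection, ultra photorealistic, highly detailed, HDR, 8k, "
    "sharp focus, clear facial features, cinematic, 35mm lens, f/l.8, "
    "accent lighting, global illumination"
)
new_prompt = (
    "portrait of a cute girl, cinematic, 35mm lens, f/l.8, accent lighting, "
    "global illumination, highly detailed, HDR, 8k, sharp focus"
)
removed, added = diff_terms(old_prompt, new_prompt)
print("removed:", removed)
print("added:", added)
```

For the last two prompts in this article, the helper reports five removed terms and no added ones. The removed terms include "ultra photorealistic", so a loss of definition in the final image could come from that removal as well as from the change in word order.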

Conclusion

My impression is that a prompt with some modifiers works better than a bare one, but adding many more does not make much difference.
I will continue to experiment with different things.
When I do, I will write another article and I hope you will read it again.

Trying with Image Generation AI