(This will be a long brain dump and it will help remind me of what I know and do not know.. which is kinda why I started this thread.. push me to learn more..)
Well, when you first heard about AI image generation it was probably MidJourney.. which is a web service that you use in a browser. For that you pay a monthly fee that varies according to how much you plan to use it, but you can start fairly cheaply. It will use a server farm's GPUs (video card-like graphics processing units, undoubtedly Nvidia) to do the processing. MidJourney uses Discord as its interface.
Stable Diffusion is an open-source system that you can install on your own PC if it is powerful enough. That means it is free to use because you are using your own PC hardware and electricity. I have a decent PC and video card (16GB Nvidia 4070). There is even a way to use Google's cloud computing to install Stable Diffusion and use a browser to generate images.
Stable Diffusion XL is the third major generation of the system that I am aware of. There was a 1.5, then a 2.1, and now the XL version that you can see being offered for install.
There are also many different open source add-ons and interfaces to Stable Diffusion that make it "easier" to get good results. I put easier in quotes because the whole thing can get very complex and, quite frankly, I have only scratched the surface and it is still a "black box" to me. I really don't have the kind of control over this beast where I feel comfortable knowing that if I ask for "this" I will get anything like what I think of as "this".
Youtube has so, so many videos on how to use it and they all make it look easy... I am nowhere near "there" yet.
Probably the most common combination of things I see being used is Stable Diffusion plus "Automatic1111" or "ComfyUI"... both of which are user interfaces to Stable Diffusion. I am using a ComfyUI variant called StableSwarmUI which adds the complexity/ability to use multiple GPUs on multiple PCs on your network, though I still use it on a single PC.
There are several youtube channels I'd recommend.. and later I will pick out specific videos at the bottom of this message as "getting started" video how-tos on installing..
Matt Wolfe - general AI.. not sure this guy sleeps at all.. so many videos, plus a website/newsletter keeping track of all his AI product research called futuretools.io
Sebastian Kamph - this guy starts every video with a dad joke.
Olivio Sarikas - This guy has a lot of Stable Diffusion and Automatic1111 (user interface) videos..
Scott Detweiler - This is the ComfyUI guy (powerful node-based user interface to Stable Diffusion).. Scott actually works at Stability AI.. who make Stable Diffusion..
Alek Sheffy - this guy uses AI tools to start online businesses.. mostly talks you through various processes to use AI to generate art, stick it on a print-on-demand white-label product like mugs or tee shirts, etc.. and use an Etsy shop to sell it without making it or shipping it out yourself. (there is a lot of competition and more coming every day.. I think the "make X-thousand $$$ per day" titles are miracle best-case examples and not what anyone should expect)
When searching for this stuff I often use filters to keep it to videos uploaded in the last month or even week, then sort by view count to make sure I see YouTubers that people choose to follow... though it is easy to miss good content from newer YouTubers with few views...
Back specifically to Stable Diffusion. If I were to start again today, I would look for a noob how-to for Stable Diffusion XL and either Automatic1111 or ComfyUI. Probably Automatic1111 first.. then ComfyUI for more control using a "node system"...
..a node system is using flow chart nodes to connect inputs and outputs to define a flow to follow in processing. A very simple textual representation of that might be..
Prompt to generate an image --> image resize upscale
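To make that flow idea concrete, here is a toy sketch in plain Python.. no AI involved, and all the function and field names are made up for illustration.. each "node" is just a step whose output feeds the next node's input:

```python
# Toy node graph: each "node" is just a function; the "edges" are
# the order in which one node's output becomes the next node's input.
# All names here are made up; this is not code from any real tool.

def prompt_node(prompt):
    # stand-in for the text-to-image generation step
    return {"prompt": prompt, "width": 512, "height": 512}

def upscale_node(image, factor=2):
    # stand-in for the image resize/upscale step
    image["width"] *= factor
    image["height"] *= factor
    return image

# Wire the nodes together: prompt --> upscale
image = prompt_node("a unicorn on a football field")
image = upscale_node(image, factor=2)
print(image["width"], image["height"])  # 1024 1024
```

ComfyUI just lets you draw this wiring visually instead of writing it out.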
Originally Stable Diffusion was limited to images at 512x512 pixels. That's small. XL can do 1024x1024, which is better... but what if you were generating something to use in a 4K video.. you need it larger. Same for many print formats.
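The size jump matters more than it sounds, because pixel counts grow with the square of the edge length.. quick arithmetic (4K UHD is 3840x2160):

```python
# Pixel counts at each size, and how far 1024x1024 still is from 4K.
sd15 = 512 * 512        # 262,144 pixels (old default canvas)
sdxl = 1024 * 1024      # 1,048,576 pixels
uhd_4k = 3840 * 2160    # 8,294,400 pixels

print(sdxl // sd15)     # 4   -> XL's default canvas has 4x the pixels
print(uhd_4k / sdxl)    # ~7.9 -> still roughly 8x short of a 4K frame
```

Which is why an upscale step after generation is such a common part of the flow.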
There are also more complex process flows.. like something called "inpainting" which would be part of an "image to image" process flow. That is, start with an image and change only this area... but use AI to make sure the new stuff you put in that area matches the prompt and looks like it belongs, rather than looking like it was just stuck on there like a sticker.
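Stripped of the AI part, inpainting boils down to "keep the unmasked pixels, regenerate only the masked ones." A toy sketch of just that masking idea.. the "generated" pixels here are hard-coded stand-ins, since the whole point of real inpainting is that a model produces them so they blend in:

```python
# Toy inpainting: keep pixels where mask is 0, replace where mask is 1.
# In real inpainting the replacement pixels come from the model and
# are blended to match the prompt; here they are just a stand-in.

original = [1, 1, 1, 1, 1]   # pretend this is a row of image pixels
mask     = [0, 0, 1, 1, 0]   # 1 = "regenerate this area"
generated = [9, 9, 9, 9, 9]  # pretend the model generated these

result = [g if m else o for o, m, g in zip(original, mask, generated)]
print(result)  # [1, 1, 9, 9, 1]
```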
It all can get extremely complex... and as I said earlier, it is easy to lose your way.
I am helping someone possibly make a children's book by generating images and one of the challenges involved is to keep the artwork consistent... style is easy to make consistent.. characters.. not so much. Imagine you had a little girl character who goes on an adventure... you'd want that little girl to look the same in each scene on each page in the book... hard to do. Plenty of videos showing you how to do "consistent characters".. still hard to make it happen mostly because you need to start with a series of images that show a model character from every angle. The more images you start with the better you can create consistent images.
Get this.. if I were trying to make some anime-inspired fake influencer babe dressed in NSFW ways, there are plenty of examples of how to do that consistently.. there are even what are called LoRA models already out there to use as a basis, where you can basically say: generate a girl LIKE this LoRA model.
LoRA is an acronym for Low-Rank Adaptation.. a small add-on file that nudges a base model toward a particular character or style without retraining the whole thing. Another keyword connected to this is ControlNet, which I still have not investigated much.. as best I can tell it guides generation using a reference image like a pose, depth map or edge outline, rather than being the collection of photos itself. The collection of photos is the training side of getting a consistent character.. think of shots showing front, left and right profile... that's 3 training images.. AI wants more.. 9+.. saw someone mention using over 50 images...
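The "Low-Rank" part of LoRA is the trick that keeps those add-on files small: instead of storing a whole new copy of a big weight matrix W, you store two skinny matrices A and B and compute W + A@B at run time. A toy sketch with made-up numbers (nothing here is real Stable Diffusion code):

```python
# Toy LoRA: adapt a 4x4 weight matrix with a rank-1 update.
# A full new 4x4 matrix would be 16 numbers; storing A (4x1) and
# B (1x4) is only 8. Real models do this on huge matrices, so the
# savings are enormous. All values below are made up.

n, rank = 4, 1
W = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]  # base weights
A = [[0.5] for _ in range(n)]    # n x rank "down" matrix
B = [[0.1, 0.2, 0.3, 0.4]]       # rank x n "up" matrix

# W_adapted = W + A @ B
W_adapted = [
    [W[i][j] + sum(A[i][r] * B[r][j] for r in range(rank)) for j in range(n)]
    for i in range(n)
]
print(W_adapted[0])  # first row becomes W[0] plus 0.5 * B[0]
```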
So for that children's book for little girls, I tried various prompts using "5-year-old Scarlett Johansson" and it kinda worked, because Stable Diffusion was trained on a lot of photos.. and there are a lot of photos of famous people out there.
In an effort to make an image for this thread I tried to get "Rutgers Football QB Gavin Wimsatt riding a unicorn on the Rutgers Stadium Football Field" with some very weird results. Sometimes it came close.. sometimes it even had the right skin tone and jersey number... that reminds me.. there are different "models" to use in Stable Diffusion.. which roughly map to what data each was trained on..
All this AI stuff.. ChatGPT and the other chatbots are based on something called a "Large Language Model".. LLM.. (image generators like Stable Diffusion are technically "diffusion models", but the same principle applies).. which basically maps to what information the AI brain knows about. What it was taught... even what it has access to. Most do not look stuff up LIVE.. it has to have already "seen" something like what you ask for. The word "seen" there means "been trained on".
Think of the famous IBM Watson playing Jeopardy (and beating Jeopardy masters).. none of them could look stuff up live during the show... even Watson.. Watson would have had to have been trained on the basic facts involved already.. had them in its database.. not the exact question and answer.. but the facts that could be compared and evaluated to arrive at the correct answer. It is the same for all this AI art generation.
Now, I should ask ChatGPT to copy-edit this long post.. but I won't.
So, getting-started type videos...
This recent MidJourney how-to seems decent... there are older how-tos too..
Then
this channel has a lot of MidJourney-specific advice, how-tos, prompts, etc.. remember, MidJourney is for those who don't have the horsepower PC and video card at home.. also, you can always ask ChatGPT to write prompts for you.. that Matt Wolfe guy I linked above often has videos showing him asking ChatGPT to write him prompts for other AI uses.
There will be pre-reqs for the installs but these videos should talk you through them.. like Python and GitHub... and all of these might talk about using cmd.exe on Windows or provide batch files to run. You will also have to be comfortable with downloading files to specific folder locations. But you can learn all this stuff the same way.. google it.
Stable Diffusion XL install how-to... (you could see if any of this is useful to you before adding other user interfaces like Automatic1111 and/or ComfyUI)
Stable Diffusion XL plus Automatic1111... (I recommend this based on how many people use Automatic1111)
This one installs Automatic1111 and the 2 main Stable Diffusion models.. fairly clear process..
This video is why I chose to try ComfyUI before the more popular Automatic1111.. the node-based system was very similar to what is found in the hottest video editor out there now, DaVinci Resolve. I had made some videos recently and this node-based thing was familiar to me because of that. In DaVinci you connect inputs and outputs from various special effects and image filters and adjustments and so on.. to layer videos over videos and so on...
Good luck...