Prompt Part 1

What is a Prompt?

Prompts are foundational to Generative AI: from simple instructions, they let us create a wide variety of digital outputs. This guide aims to take you from beginner toward mastery of prompt crafting. A prompt is akin to a whisper to an AI 'artist', starting a dynamic interaction in which the AI turns ideas into digital creations. The emerging role of the Prompt Engineer embodies this art, mediating between human ideas and AI interpretation. Prompting may seem straightforward, but it involves depth and skill akin to mastering a musical instrument. Effective prompting goes beyond basic input and takes understanding and practice to fully harness AI's capabilities.

Understanding the Significance of Prompts in Generative AI

In the exciting world of Generative AI, there's this amazing thing called a "prompt." Think of it as a simple instruction that can make AI create all sorts of cool stuff like images, music, text, and even videos. It's like giving an artist a hint, but in this case, the artist is a computer, and the hint is just some words we write down.

When you use a prompt, you're having a conversation with the AI. You say something, and the AI responds with a digital masterpiece, a piece of writing, or a catchy tune. It's a bit like magic, blending precision with unexpected surprises.

And here's something cool – there's a job called a "Prompt Engineer." These folks are like the wizards who understand both human ideas and how AI thinks. They use this knowledge to write prompts that guide the AI to create exactly what you want.

Now, while using prompts might seem easy, there's a catch. Just like playing a musical instrument, it takes practice to become an expert. Anyone can make AI do something with a simple prompt, but becoming a pro at it, like a guitar legend, requires a deeper understanding of how AI works.

To master prompt engineering, you need to learn how AI models work and then practice, a bit like learning to play music with style. It's a journey that lets you unleash the full potential of Generative AI.

 

What Makes a Great Prompt?

First off, it's essential to grasp that prompts should fit the AI platform you're using. Different AI models have their specialties, which strongly affect how they understand and act upon instructions. Think of it like this: a text-to-image system such as Stable Diffusion and a text-based model like ChatGPT each have their own way of doing things. They come with their own architectures, biases, and knowledge bases, almost like their own languages. So, our prompts need to speak their language for the best results.

Here's the fun part about prompting: it's a bit like cooking. Just as every chef has their style, techniques, and secret ingredients, prompters have their own methods and preferences. When you chat with other prompters, you'll discover that some swear by one approach, while others have a completely different take. And that's perfectly fine!

Our goal with this guide isn't to lay down strict rules or claim there's a one-size-fits-all way to prompt AI. Instead, think of it as your toolkit – a collection of handy tips, insights, and strategies to kickstart your journey in prompt crafting.

The idea is to give you a foundation and a set of techniques that you can adapt and refine to match your style. Keep in mind, prompting is part art (despite what some AI critics might say!) and part science. The beauty of art lies in its diversity and personal touch, and the same goes for crafting prompts.

 

Stable Diffusion

This guide starts by diving into the art of prompting for Stable Diffusion (txt2img), which is a field we specialize in here at Civitai. However, as time goes on, we'll expand our horizons to cover tactics for other Generative AI technologies as well.

 

Important Considerations

When it comes to prompting for Stable Diffusion, it's vital to remember that while the overall structure and format often remain consistent across models, the specific words or tokens you use can lead to a wide range of outcomes. What's effective for one model might not yield the same results with another!

Furthermore, it's important to note that prompting for SD 1.4/1.5 differs significantly from prompting for models using the SDXL architecture. While there's some overlap, and an SD 1.5 prompt might work reasonably well with an SDXL model, it's often best to tailor your prompt to the specific framework for optimal results.

 

Getting Started with Prompting

As we mentioned earlier, there are various ways to prompt, and there's no "wrong" way as long as we're satisfied with the results. However, structuring our prompts can be quite helpful. It allows us to maintain consistency and develop good "prompting habits," making our prompts easier to understand.

In the sections that follow, we'll cover the basics you need to kickstart your prompting journey effectively!

 

The Main Focus

The subject is like the star of our image, the central object we want to show.

Subject description: 1girl, woman, petite

 

The Tools

The medium is all about the tools an artist uses, like oil paint, watercolors, charcoal, or pencils. In the same way, we can tell Stable Diffusion to create art in a specific medium by mentioning it in our prompt. Some models are designed specifically for certain mediums, so they might not need extra instructions to get the right look.

 

     

Medium: watercolor painting

 

Expressing Artistic Style

Style is like the unique artistic flavor of our image. Think of styles like impressionism, realism, pop-art, surrealism, and more. Just like with the medium, some models are tailored to produce a specific style, so you might not need extra instructions to achieve the style you want.

 

     

Style: impressionist background

 

Arranging the Art

Composition is all about how we put together the different parts of the image to make it look just right. It involves things like arranging objects to create balance and symmetry, framing the image, making sure the sizes match, and other artistic tricks to bring our vision to life.

 

     

Composition: from above

 

Playing with Color and Light

We can have fun with both color and lighting in our images. This means we can change the colors of things in our scene or adjust the overall color tone.

Lighting is also a big deal in art, and we get to control a lot of it. We can tweak things like shadows, how bright the image is, and how vibrant it looks.

 

Color: rainbow hue

     

Lighting: bright

 

Creating an Effective Prompt – How to Do It Right

As we mentioned earlier, it's a good idea to follow a somewhat standard structure when crafting prompts. When we begin crafting our own prompts, we'll discover what works best for us and how we prefer to organize them.

Here's one way to structure a prompt. Besides making the prompt neat and easy to read, it has a practical benefit: keeping related tokens grouped together increases the chances that they'll all show up in the final output. For example, if we want to describe someone with "red hair" and a "long nose," it's better to place them together in one "block" of tokens than to scatter them throughout the prompt.

Let's break down our prompt into three sections:

The First Section – Subject & Setting

In the first section, we describe the subject and their appearance, like "1girl, woman, petite," and so on. We also mention the setting, where they are, and the mood we want, like "in a park, sky, trees, moonlight, stars."

The Second Section – Color, Style, and Lighting

In the second part of the prompt, we talk about colors, style, and lighting. This is where we define the visual elements we want, such as "vivid colors, bokeh background, dramatic color, cartoon."

The Third Section – Composition & Extra Details

The final section of the prompt focuses on the composition and any additional details we want to set the atmosphere. We can mention things like the angle, mood, and any other specific elements we'd like to see, such as "from below, cinematic, whimsical."

 

     

1girl, woman, petite, pale skin, detailed face, bobcut hair, blue eyes, wearing yellow tank top, happy, laugh, statement sunglasses, in a park, sky, trees, moonlight, stars, vivid colors, bokeh background, dramatic color, cartoon, from below, cinematic, whimsical
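
The three-section structure described above can be sketched in a few lines of Python. This is only an illustration, assuming a prompt is a plain comma-separated string; the function name `build_prompt` and the variable names are hypothetical and not part of any Stable Diffusion tool. The token lists are shortened versions of the guide's example.

```python
def build_prompt(*sections):
    """Join blocks of tokens into one comma-separated prompt.

    Each section is a list of tokens. Tokens within a section stay adjacent,
    and sections keep the order in which they are passed.
    """
    tokens = []
    for section in sections:
        for token in section:
            token = token.strip()
            if token:  # skip empty entries
                tokens.append(token)
    return ", ".join(tokens)


# Section 1: subject and setting
subject_and_setting = ["1girl", "woman", "petite", "in a park", "moonlight", "stars"]
# Section 2: color, style, and lighting
color_style_lighting = ["vivid colors", "bokeh background", "dramatic color", "cartoon"]
# Section 3: composition and extra details
composition_and_details = ["from below", "cinematic", "whimsical"]

prompt = build_prompt(subject_and_setting, color_style_lighting, composition_and_details)
print(prompt)
```

Keeping each section in its own list makes it easy to swap out one block, say the style, while leaving the rest of the prompt untouched.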

 

Avoiding Unwanted Elements

Negative prompts help us specify what we don't want to appear in our images. Mentioning what we don't want in the positive prompt can have unintended results: writing "no hat" there can make a hat more likely to show up, because the word "hat" is still part of the prompt. To address this, we use a separate Negative Prompt section to exclude any undesired features from our images.

 

     

Common No-Nos

In many prompts, you'll find a "common" or "universal" negative. This includes a list of things we generally don't want to see in our images. An example might look like this:

low res, bad hands, text, error, missing fingers, extra digit, fewer digits, cropped, worst quality, low quality, normal quality, jpeg artifacts, signature, watermark, username, blurry, ugly
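
One way to keep the two prompts apart in your own scripts is to store them as separate fields. The sketch below is only an illustration: `GenerationRequest`, `with_common_negative`, and their parameters are hypothetical names and do not correspond to any particular tool's API. The token list is copied from the example above.

```python
from dataclasses import dataclass

# The "universal" negative from the guide, kept as a list so it is easy to edit.
COMMON_NEGATIVE = [
    "low res", "bad hands", "text", "error", "missing fingers", "extra digit",
    "fewer digits", "cropped", "worst quality", "low quality", "normal quality",
    "jpeg artifacts", "signature", "watermark", "username", "blurry", "ugly",
]


@dataclass
class GenerationRequest:
    """Hypothetical container: positive and negative prompts stay separate."""
    prompt: str
    negative_prompt: str = ""


def with_common_negative(prompt, extra_negative=()):
    """Attach the common negative list, plus any image-specific exclusions."""
    negative = list(COMMON_NEGATIVE)
    for token in extra_negative:
        if token not in negative:  # avoid listing the same token twice
            negative.append(token)
    return GenerationRequest(prompt=prompt, negative_prompt=", ".join(negative))


request = with_common_negative(
    "1girl, woman, petite, in a park",
    extra_negative=["hat", "blurry"],
)
print(request.negative_prompt)
```

Here "hat" is added to the negative prompt rather than written as "no hat" in the positive one, and "blurry" is not repeated because it is already in the common list.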

 

 

Words vs. Tokens – How We Talk to AI

Stable Diffusion doesn't understand words the way humans do. Instead, it relies on a "tokenizer" (the CLIP tokenizer, in the case of SD 1.5) to convert our prompt into "tokens," which are numerical IDs drawn from its "vocabulary."

Most common words are represented by a single token, but longer or less common words may be broken down into several tokens. Really obscure or unusual words can end up split into small fragments that the model may not associate with the meaning we intended, leading to unexpected results in the prompt.

Let's take a look at an example:

     

The interesting thing here is that the word "whiskers" had to be split into two tokens (6024, 2880) for Stable Diffusion to understand it. But don't worry, Stable Diffusion still recognizes many multi-token words!
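
To make the idea of splitting a word into tokens concrete, here is a toy sketch. It is not the CLIP tokenizer: the vocabulary, the token IDs, and the greedy longest-match rule below are all invented for illustration, whereas the real tokenizer uses a learned byte-pair-encoding vocabulary with its own pieces and IDs.

```python
# Hypothetical vocabulary: pieces and IDs are made up for this example.
TOY_VOCAB = {"cat": 1, "with": 2, "long": 3, "whisk": 4, "ers": 5}
UNKNOWN_ID = 0  # stand-in for text the toy vocabulary cannot cover


def tokenize_word(word, vocab=TOY_VOCAB):
    """Split one word into the longest vocabulary pieces, left to right."""
    ids = []
    start = 0
    while start < len(word):
        # Try the longest remaining substring first, then shrink it.
        for end in range(len(word), start, -1):
            piece = word[start:end]
            if piece in vocab:
                ids.append(vocab[piece])
                start = end
                break
        else:
            # No piece matched: emit the unknown ID and skip one character.
            ids.append(UNKNOWN_ID)
            start += 1
    return ids


def tokenize(prompt, vocab=TOY_VOCAB):
    """Tokenize a whitespace-separated prompt word by word."""
    return [tokenize_word(word.lower(), vocab) for word in prompt.split()]


print(tokenize("cat with long whiskers"))
```

In this toy vocabulary, "cat," "with," and "long" each map to a single ID, while "whiskers" has no entry of its own and is covered by two pieces, mirroring the two-token split described above.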

 

Conclusion

In this guide, we've covered the absolute basics of creating prompts, but there's much more to explore! In Part 2, we'll delve into prompt weight and emphasis, and in the final part, we'll tackle advanced topics.