When you type a prompt into Midjourney, it breaks the prompt down into tokens, or a series of ideas. For example, the prompt "a photorealistic portrait of a cat" would be interpreted as "photorealistic", "portrait", and "cat."
Those tokens are then transformed into a mathematical representation and fed into a machine learning model that creates the image.
Midjourney tends to place more emphasis on the tokens at the beginning of a prompt. The longer the prompt, the less emphasis is placed on each token. Starting with a simple prompt of 3-5 words (or fewer) is a great way to build a foundation for your final prompt.
Nick St. Pierre has created a technique for structuring prompts that he calls "Additive Prompting". The idea is that you start with your basic idea, such as "photorealistic portrait of a cat", then slowly add more tokens or details once Midjourney has given you an output resembling what you expect to see. Below is the order of the separate parts or types of tokens that can be used to transform your prompt from simple to advanced:
Before we hop into the tutorial below, it's important to note that this technique is not the only way to structure your prompt. Sometimes the order doesn't matter and you still get the result you want, but this structure serves as a solid foundation that should get you very close to the outcome you're looking for.
We're big fans of The Joker and all things Batman, so we decided to start with a basic prompt of Joaquin Phoenix and transform him into The Joker without using "The Joker" in the prompt. Using a specific artist or aesthetic gets you close to the endpoint faster, but Additive Prompting gives you much more control over each element in the piece. Let's take a look at how this works.
Our starting point is a basic prompt that gives us a street style photo of Joaquin Phoenix. We wanted to start as simply as possible, almost as if we were taking him from the street to the movie set, where we then start applying all the details that make him The Joker.
Next, we apply "medium shot" to the prompt because we want to see him from the waist up. Here are a few other shots we could have used.
Then we apply "film still" to the prompt so that the image has a more cinematic feel to it.
Next, we add "Kodak Gold" to the prompt to give it a more vintage look. Kodak Gold is great for portraits because it adds warm color and has medium contrast properties.
Then, we add "walking" to the prompt so that there is a sense of motion. You could use "dancing" to get an output that resembles the movie trailer scenes. Scroll all the way to the last step to see our final version where we used "dancing" and added "--ar 16:9" to give the image a more cinematic composition.
Now comes the fun part. Step 6 is a two-part step because we applied the styling slowly. For this first part, we simply added "red wool suit".
Then, we added "dark green slicked-back hair, clown makeup" to the prompt. We could have applied those separately but we got excited.
In this step we added "1970s new york city". The environment doesn't change much, but his suit definitely does! The step above looks much more modern.
This is a subtle step where we added "overcast" and the image became a bit more gray. The suit is much less vibrant.
Next, we add "foggy" to the prompt. This creates a much more dramatic scene and the makeup is looking better somehow.
Finally, we add "depressing", replace "walking" with "dancing", and add "--ar 16:9" so that we can try to replicate a scene from the movie trailer. We're really happy with how it turned out.
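If you like to keep track of your prompts in a script, the steps above can be sketched roughly as follows. This is only an illustration of the additive idea, not a Midjourney tool: the helper name additive_prompts is made up, the wording of the starting prompt is our guess at a "street style photo" prompt, and joining tokens with commas in the order they were added is an assumption about how you might lay out the prompt.

```python
from typing import List


def additive_prompts(base: str, additions: List[str]) -> List[str]:
    """Return the prompt after each additive step, starting with the base prompt.

    Hypothetical helper: tokens are joined with commas in the order added.
    """
    tokens = [base]
    prompts = [base]
    for token in additions:
        tokens.append(token)
        prompts.append(", ".join(tokens))
    return prompts


# Assumed wording for the starting prompt; the tokens below come from the steps above.
base = "street style photo of Joaquin Phoenix"
additions = [
    "medium shot",
    "film still",
    "Kodak Gold",
    "walking",
    "red wool suit",
    "dark green slicked-back hair, clown makeup",
    "1970s new york city",
    "overcast",
    "foggy",
]

prompts = additive_prompts(base, additions)
for step, prompt in enumerate(prompts, start=1):
    print(f"{step}: {prompt}")

# Last step: add "depressing", swap "walking" for "dancing", and append "--ar 16:9".
final_tokens = [base] + additions + ["depressing"]
final_tokens = ["dancing" if token == "walking" else token for token in final_tokens]
final_prompt = ", ".join(final_tokens) + " --ar 16:9"
print(final_prompt)
```

Printing every intermediate prompt makes it easy to paste them into Midjourney one at a time and to go back a step if a new token pushes the image in the wrong direction.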
What do you think?