Sep 22, 2023 · edited Sep 22, 2023 · Liked by Bryan Alexander
Bit of a breakthrough here...
"“DALL·E 3 can accurately represent a scene with specific objects and the relationships between them.”
Neither Midjourney nor Stable Diffusion allow you to do this—solitary characters and objects are easy and the quality is high, but scenes where different objects have to follow specific relationships described in the prompt? That was an unsolved challenge.
Sam Altman predicted a while ago that prompt engineering was a temporary phase of generative AI. I agreed back then but argued that it could take a lot of time to get the models to the point where we wouldn't need to translate our ideas into a language they could understand. It seems that milestone, at least for image generation models, has been achieved.
This means that the entry barriers that somewhat “gatekept” the ability to create amazing images with AI are being demolished fast. Visual creativity is being democratized."
https://thealgorithmicbridge.substack.com/p/openai-has-just-killed-prompt-engineering
Yes, I'm following it, and itching to get my hands on it.
Solving letters would be a big deal, as that's a serious issue with DALL-E.
I'm not sure, though, how this changes prompt engineering. Just more of it, no?
Natural description with rich imagination is supplanting coded prompting constrained by machine semiotics. As a perennial philosopher who has witnessed the weaponization of the fields of ethics and evolution, I am watching promising developments in the Open Source community that will promote genuine inclusion and freedom of thought and expression. Perhaps this is a step in that direction?
If so, we should expect hostility to open source AI.
I am a Tintin fanatic, so I wonder whether the fact that it picked Waterloo (?) has to do with Brussels (where Hergé was from). Also, yes, the bottom images reflect Hergé's signature "ligne claire" drawing style - clear black lines, not too worried about realism or perspective for the characters themselves.
Interesting to think of the Brussels connection.
Not a bad job, then?
This was amazing! Thanks Bryan.
You're welcome, Mary. I don't know if you can tell that I had a lot of fun doing it.
I hope Hugging Face brings it back!