AI Avatars and why better (and open) solutions didn’t go viral

Lucas Blassioli
4 min read · Dec 5, 2022


Right now, my Instagram feed is a mess of AI-generated images of people’s faces. It’s a new trend that has hit TikTok, Twitter, Instagram, and other major social media platforms, like the rising Koo in Brazil and even the decentralized Mastodon. It is all thanks to Lensa, an app with AI-powered filters. That is nothing incredibly new: we saw Prisma Labs rise in the past and become a paid service, and we have already had some demos of DALL-E 2 and the spread of its ‘mini’ version.

Lensa screenshot on the Brazilian version of the Apple App Store
Right now, Lensa is the most downloaded app in the “Photos and Videos” category in the Brazilian App Store.

But why, right now, is a paid app trending for making stylized selfies with filters that you can’t choose or edit afterwards? DALL-E was free for a time (and is very affordable), and other tools are open and can be implemented in anything, so why is this app the one reaching such a large audience? Because of the way the selfies are created: they require no thinking from the user. It’s that simple. The enemy of Lensa’s easy UI is the text field, the same field used by all the other apps. For reliable results with those apps, you need to use your imagination and fill in a text field, and that is after you have trained the model with your own images. This is the start page of DALL-E from OpenAI.

“Start with a detailed description”

I’ll be posting two examples:

The first one is a detailed sentence. The prompt was “Russian Cosmonaut in space who is full of stars observing the planet earth who is a blue ball with the green continents as an impressionist paint”. The second one is simply “Russian Cosmonaut in Space as an impressionist paint”. To get the details and the composition right, you need a narrative and a clear idea of what you want to achieve. A generic sentence will generate a generic image. Simple sentences can generate awesome images, but not necessarily the image you want, so you need to improve your prompt many times. I only got the Earth right once I specified how the Earth looks and how space looks. Now let’s see an image created by the paid Lensa:

The only substantial input I had to provide was fifteen selfies, and the app created one hundred images for me, most of them photorealistic, using my face and with a bunch of details. Nothing more than that: upload the images, say whether you are male or female, and pay. Another small detail that matters to me is the image size. Prisma Labs knows that people like their own image: one DALL-E image is 1024x1024, while a Lensa image can be exported at 4096x4096 and cropped for use as a wallpaper on most modern smartphones.
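To put those two sizes side by side, here is a quick sketch of the arithmetic. The image sizes are the ones quoted above; the phone screen size is my own assumption, picked only as an example of a modern portrait display.

```python
# Image sizes quoted in the article, as (width, height) in pixels.
DALLE_SIZE = (1024, 1024)
LENSA_SIZE = (4096, 4096)

# Assumption: an example portrait phone screen, not a figure from the article.
PHONE_SCREEN = (1170, 2532)


def pixel_count(size):
    """Total number of pixels in an image of the given (width, height)."""
    width, height = size
    return width * height


def covers_screen(image, screen):
    """True if the image can be cropped to the screen size without upscaling."""
    return image[0] >= screen[0] and image[1] >= screen[1]


ratio = pixel_count(LENSA_SIZE) / pixel_count(DALLE_SIZE)
print(f"Lensa export has {ratio:.0f}x the pixels of a DALL-E image")
print("DALL-E image covers the screen:", covers_screen(DALLE_SIZE, PHONE_SCREEN))
print("Lensa image covers the screen:", covers_screen(LENSA_SIZE, PHONE_SCREEN))
```

Under that assumed screen size, the 1024x1024 image would have to be upscaled to fill the display, while the 4096x4096 export leaves room to crop.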

Another example is how my friend trained his own AI to create his ‘smart avatars’. This is the typical amount of text he needs to input. In this example, he used Robert Pattinson, and used a text AI to get inspiration.

And according to him, he still needed to improve the prompt to get equivalent results with his own photos.

Of course, that is only one side of the matter, and I am barely scratching the surface of the topic. Still, it shows what an easy UI can do when the only input an app asks for is “your photos”. I’ve put DALL-E in front of my mom and some friends in the past: they don’t know what to create, they don’t know what to prompt, and even when the model is trained with their photos they don’t know what to do with it. Before surrendering my data to Prisma Labs, I tried to create my own AI avatars.

The conclusion is simple: you need to trust your creativity, and you need to know what you want. That’s not easy, not even for me. The Russian Cosmonaut was my friend’s idea, and I just worked with that example here. I think that when you hand all the decision-making over to an AI, that is when you start to see something happen. One thing I know for sure: the first testers knew what they wanted. Now we, and many other people, will trust their decisions.
