1) Are you generating a draft or generating live? These days, most users are happy to generate a text, image or video a few times and then go with the best one. Or they take a few different generations as drafts and mix and match them to get their final result.
That imposes very relaxed requirements on models. The general public often evaluates them on their Best of N performance. Incidentally, that has another interesting consequence - it has become a challenge even for the biggest tech companies to pull off a big, successful launch of a new model, because it will always be compared to cherry-picked examples from the past.
It’s a very different game when things have to be interactive and images must be good enough to go live immediately, e.g. in interactive ads, games or entertainment. We know it very well at Quickchat AI, where we get no second try on what our AI says. That determines how our AI engineers spend a huge chunk of their time: on testing.
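To put a rough number on the gap between "draft" and "live", here is a minimal sketch. It assumes every generation is independently acceptable with the same probability p, which is a simplification, and the 60% figure is a made-up value for illustration only. Under that assumption, the chance that at least one of N drafts is good enough is 1 - (1 - p)^N.

```python
def best_of_n_success(p: float, n: int) -> float:
    """Chance that at least one of n independent generations is acceptable.

    p is the (assumed) probability that a single generation is good enough.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1.0 - (1.0 - p) ** n


# Hypothetical single-shot acceptance rate, chosen only for illustration.
p = 0.6
for n in (1, 3, 5):
    print(f"best of {n}: {best_of_n_success(p, n):.3f}")
# best of 1: 0.600
# best of 3: 0.936
# best of 5: 0.990
```

A model that looks nearly flawless when the user picks the best of five drafts still fails four times out of ten in this toy setup when it only gets one shot, which is the situation a live product is in.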
2) Why is nobody talking about testing? It might be because most products these days are generating a draft - and that’s what users are ok with (for now).
How do you write a test for “is this image generated well enough”? How do you write a test for “does this text sound Shakespeare-like enough”? These questions are probably not that important if the user is always patient enough to generate a new draft.
Going back to the Conversational AI space, consider testing for “is this answer correct?” or “did this conversational experience go the way we intended?”. It’s what we think about a lot at Quickchat AI. Performance must be tested, measured and improved upon iteratively.
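As a purely illustrative sketch of what a test for “is this answer correct?” could look like, the snippet below scores a bot against a small set of cases and reports a pass rate that can be tracked from one iteration to the next. Everything in it is hypothetical: the `Case` structure, the substring checks, the example questions and the `stub_bot` stand-in are assumptions made for the example, not a description of how any particular product is tested. Substring matching is also a deliberately crude notion of correctness.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Case:
    question: str
    must_contain: list[str]      # facts the answer has to mention
    must_not_contain: list[str]  # claims the answer must never make


def passes(answer: str, case: Case) -> bool:
    """Crude correctness check: required facts present, forbidden claims absent."""
    text = answer.lower()
    has_required = all(s.lower() in text for s in case.must_contain)
    has_forbidden = any(s.lower() in text for s in case.must_not_contain)
    return has_required and not has_forbidden


def pass_rate(bot: Callable[[str], str], cases: list[Case]) -> float:
    """Fraction of cases the bot answers acceptably (the metric to track)."""
    if not cases:
        raise ValueError("need at least one test case")
    return sum(passes(bot(c.question), c) for c in cases) / len(cases)


# Hypothetical test cases for an imaginary support bot.
CASES = [
    Case("What are your opening hours?", ["9:00", "17:00"], ["24/7"]),
    Case("Do you offer refunds?", ["30 days"], ["no refunds"]),
]


def stub_bot(question: str) -> str:
    """Stand-in for the real system under test; returns canned answers."""
    if "hours" in question.lower():
        return "We are open from 9:00 to 17:00 on weekdays."
    return "Sorry, I am not sure."


print(f"pass rate: {pass_rate(stub_bot, CASES):.0%}")  # pass rate: 50%
```

The point of a harness like this is less the individual check and more the number it produces: once “good enough” is expressed as a pass rate over a fixed set of cases, it can be measured after every change and improved iteratively.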
3) Are you solving a real problem? Every person on the planet wants to play around with AI for a bit - generate some images, talk to an AI bot, make a video of X singing song Y while Z is dancing to it. And that’s a huge market, or rather, one with a huge initial spike that may make anyone feel optimistic.
When all is said and done though, the product must solve a real problem or else people will stop paying for it. Don’t confuse a user excited to try out your product (even if they forgot to cancel their subscription) with one excited to actually use and pay for your product in the long run.