With fashionable but fake pictures of the pope being widely circulated, and photography competitions being won by synthesised entries, our CEO, Harry Keen, asks: how can we differentiate between the genuine and the artificial?
You may have seen the news this week that a fully synthetic image won a real photography competition. Professional human judges were fooled into believing it was real and, if you look at the picture, you can’t blame them! The AI used to build this image and others like it is extremely good.
This is a fairly harmless example, but we have already seen the potential harm this type of technology could cause with deepfakes. The technology has improved since deepfakes first appeared: it has broader applicability, and it is now much easier and cheaper to use.
This is a considerable global issue our society has never had to tackle before. We are aware of the power misinformation can have and how quickly it can spread through poorly moderated social media. Add a technology that creates lifelike but fake images, videos, speech, music and text to the mix and you’ve got a loaded gun. We don’t know the damage that gun can cause yet, but it’s not difficult to imagine scenarios that have dramatic and potentially harmful outcomes.
We’ve been in this field for over six years and have seen how useful generative AI can be for businesses if used ethically. At Hazy, we create lifelike synthetic data precisely to represent real data as closely as possible, so our customers can use the information it contains without putting real data or personal information about their customers at risk. We must continue to develop products using this technology to realise these potential benefits. However, we cannot risk living in a world where we don’t know what is real and what is not.
In Hazy’s domain of structured synthetic data, our solution to this problem is watermarking. We subtly mark synthetic data so our customers can easily distinguish synthetic data records from real ones.
Enterprise businesses have a very practical reason for needing this. It could pose a significant problem if a user accidentally pushed synthetic customer data up to real production systems. Untangling that mess could be an expensive process. And if synthetic data were used maliciously, customers could be put at risk, you could fall foul of regulators, and your brand reputation could be permanently tarnished.
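To make the idea concrete, here is a minimal sketch of how a watermark check could guard a production load. It is an illustration only and not a description of Hazy's actual scheme: the field name `_wm`, the demo key and the function names are all hypothetical, and the tag is stored in an explicit extra field rather than hidden subtly in the data. It uses a keyed hash (HMAC) from the Python standard library, so that only someone holding the key can produce a valid mark.

```python
import hashlib
import hmac
import json

# Hypothetical names for illustration; this is not Hazy's actual scheme.
WATERMARK_FIELD = "_wm"
SECRET_KEY = b"demo-key-change-me"


def _tag(record: dict, key: bytes) -> str:
    """Compute a keyed tag over the record's fields, excluding the watermark field."""
    payload = {k: v for k, v in record.items() if k != WATERMARK_FIELD}
    canonical = json.dumps(
        payload, sort_keys=True, separators=(",", ":"), default=str
    ).encode("utf-8")
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()


def watermark(record: dict, key: bytes = SECRET_KEY) -> dict:
    """Return a copy of a synthetic record with the watermark attached."""
    marked = dict(record)
    marked[WATERMARK_FIELD] = _tag(record, key)
    return marked


def is_synthetic(record: dict, key: bytes = SECRET_KEY) -> bool:
    """True if the record carries a watermark that verifies under the key."""
    tag = record.get(WATERMARK_FIELD)
    if not isinstance(tag, str):
        return False
    expected = _tag(record, key)
    # Compare as bytes in constant time.
    return hmac.compare_digest(tag.encode("utf-8"), expected.encode("utf-8"))


def reject_synthetic(records: list, key: bytes = SECRET_KEY) -> list:
    """Raise if any record is watermarked; otherwise return the records unchanged."""
    flagged = [r for r in records if is_synthetic(r, key)]
    if flagged:
        raise ValueError(f"{len(flagged)} synthetic record(s) found; refusing to load")
    return records
```

A production ingestion step could call `reject_synthetic(batch)` before writing anything. One limitation of this particular sketch is that editing a marked record, or dropping the `_wm` field, stops the mark from verifying, so it protects against accidents rather than deliberate removal.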
The lessons we’ve already learned should be adopted by the wider industry and society in general. We propose synthetic or fake media and information should declare itself as such, and that should be a clear standard our society holds itself to. Not doing so should be classed as deception and misinformation, and ultimately should be regulated against.
Unfortunately, the truth is that governments will introduce laws and regulation more slowly than the technology develops. That leaves the responsibility in the hands of the small number of businesses developing these technologies and the people using them.
I am an optimist and it is my belief that the companies developing generative AI want to see their products put to use to benefit society, not harm it. I also believe societal pressure can wield real influence. Deepfake technology did cause harm, but it was ostracised quickly, and courts have since set precedents against its malicious use by drawing on a blend of new and existing laws. The sooner we have open, informed discussions around these more recent issues and possible solutions, the sooner businesses can use that information to agree on and adopt a set of principles they can hold themselves to. Watermarking should be one of those standards.
Learn more about how we use generative AI to create our synthetic data here.