We use a multi-tiered safety system to limit DALL·E 3's ability to generate potentially harmful imagery, including violent, adult, or hateful content. Safety checks run over user prompts and the resulting imagery before it is surfaced to users. We also worked with early users and expert red teamers to identify and address gaps in coverage for our safety systems which emerged with new model capabilities. For example, the feedback helped us identify edge cases for graphic content generation, such as sexual imagery, and stress test the model's ability to generate convincingly misleading images.
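To make the "multi-tiered" idea concrete, here is a minimal sketch of what such a pipeline can look like: a text check on the prompt before generation, and an image check on the output before it is shown. Everything here is an assumption for illustration; the function names (`check_prompt`, `check_image`, `image_safety_score`), the blocklist, and the 0.5 threshold are hypothetical stand-ins, not DALL·E 3's actual system.

```python
# Hypothetical sketch of a multi-tiered safety pipeline; the checks,
# thresholds, and names below are illustrative assumptions only.
from dataclasses import dataclass
from typing import Callable

@dataclass
class SafetyResult:
    allowed: bool
    reason: str = ""

def check_prompt(prompt: str) -> SafetyResult:
    """Tier 1: screen the text prompt before any image is generated."""
    blocklist = {"violent", "hateful"}  # stand-in for a learned text classifier
    for term in blocklist:
        if term in prompt.lower():
            return SafetyResult(False, f"prompt flagged: {term}")
    return SafetyResult(True)

def image_safety_score(image_bytes: bytes) -> float:
    # Placeholder: a real system would run a trained vision classifier here.
    return 0.1

def check_image(image_bytes: bytes) -> SafetyResult:
    """Tier 2: screen the generated image before it is surfaced to the user."""
    score = image_safety_score(image_bytes)
    return SafetyResult(score < 0.5, f"image risk score {score:.2f}")

def generate_safely(prompt: str, generate: Callable[[str], bytes]) -> bytes:
    """Run both tiers around an injected image generator."""
    pre = check_prompt(prompt)
    if not pre.allowed:
        raise ValueError(pre.reason)
    image = generate(prompt)
    post = check_image(image)
    if not post.allowed:
        raise ValueError(post.reason)
    return image  # only returned after both tiers pass
```

The point of the structure is that a rejection at either tier stops the image from ever reaching the user, which mirrors the "before it is surfaced to users" ordering described above.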
As part of the work done to prepare DALL·E 3 for deployment, we've also taken steps to limit the model's likelihood of generating content in the style of living artists or images of public figures, and to improve demographic representation across generated images. To read more about the work done to prepare DALL·E 3 for wide deployment, see the DALL·E 3 system card.
User feedback will help make sure we continue to improve. ChatGPT users can share feedback with our research team by using the flag icon to report unsafe outputs or outputs that don't accurately reflect the prompt they gave ChatGPT. Listening to a diverse and broad community of users and building real-world understanding is critical to developing and deploying AI responsibly and is core to our mission.
We are researching and evaluating an initial version of a provenance classifier—a new internal tool that can help us identify whether or not an image was generated by DALL·E 3. In early internal evaluations, it is over 99% accurate at identifying whether an image was generated by DALL·E when the image has not been modified. It remains over 95% accurate when the image has been subject to common types of modifications, such as cropping, resizing, JPEG compression, or when text or cutouts from real images are superimposed onto small portions of the generated image. Despite these strong results on internal testing, the classifier can only tell us that an image was likely generated by DALL·E, and does not yet enable us to make definitive conclusions. This provenance classifier may become part of a range of techniques to help people understand whether audio or visual content is AI-generated. It's a challenge that will require collaboration across the AI value chain, including with the platforms that distribute content to users. We expect to learn a great deal about how this tool works and where it might be most useful, and to refine our approach over time.
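As a rough illustration of how robustness to such manipulations can be measured, the sketch below applies the manipulations named above (cropping, resizing, JPEG compression) to a labeled image set and reports per-manipulation accuracy. This is a minimal evaluation harness under stated assumptions: `classify` is a hypothetical stand-in for the classifier, the transform parameters (crop fraction, scale factor, JPEG quality) are arbitrary choices, and nothing here reflects the internal tool or its reported numbers. Only the Pillow imaging library is used.

```python
# Minimal sketch of a robustness evaluation for a provenance classifier.
# `classify` is a hypothetical callable returning True for AI-generated images.
import io
from PIL import Image

def jpeg_compress(img: Image.Image, quality: int = 60) -> Image.Image:
    """Re-encode as JPEG at the given quality (an assumed setting)."""
    buf = io.BytesIO()
    img.convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf)

def center_crop(img: Image.Image, fraction: float = 0.8) -> Image.Image:
    """Keep the central `fraction` of the image in each dimension."""
    w, h = img.size
    dw, dh = int(w * (1 - fraction) / 2), int(h * (1 - fraction) / 2)
    return img.crop((dw, dh, w - dw, h - dh))

def downscale(img: Image.Image, factor: float = 0.5) -> Image.Image:
    """Resize the image by an assumed scale factor."""
    w, h = img.size
    return img.resize((max(1, int(w * factor)), max(1, int(h * factor))))

MANIPULATIONS = {
    "none": lambda img: img,
    "jpeg": jpeg_compress,
    "crop": center_crop,
    "resize": downscale,
}

def evaluate(classify, labeled_images):
    """Return accuracy per manipulation.

    `labeled_images` is a sequence of (PIL.Image, is_generated) pairs.
    """
    labeled_images = list(labeled_images)  # allow repeated passes
    results = {}
    for name, transform in MANIPULATIONS.items():
        correct = sum(
            classify(transform(img)) == label for img, label in labeled_images
        )
        results[name] = correct / len(labeled_images)
    return results
```

An evaluation of this shape makes the accuracy claim well-defined: the "none" row corresponds to unmodified images, and each other row measures how much a given manipulation erodes the classifier's signal.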