Q: What does joining the network involve?
AND: Being part of a network means that you can be contacted about the possibility of testing a new model or testing an area of interest on a model that has already been implemented. Work conducted as part of the network is conducted under a non-disclosure agreement (NDA), although we have historically published many of our red team findings in system tabs and blog posts. You will be compensated for time spent on red team projects.
Q: What is the expected time commitment to participate in the network?
AND: The time you decide to dedicate can be adjusted depending on your schedule. Note that not everyone in the network will be contacted for every opportunity, OpenAI will make selections based on the right fit for a particular red pooling project and highlight new perspectives in subsequent red pooling campaigns. Even just 5 hours in one year would still be valuable to us, so don't hesitate to apply if you are interested but your time is limited.
Q: When will applicants be notified of acceptance?
AND: OpenAI will continuously select network members and you can apply until December 1, 2023. After this application period, we will re-evaluate the opening of future re-application opportunities.
Q: Does being part of the network mean I will be asked to join the red team for every new model?
AND: No, OpenAI will make a selection based on the appropriate fit for a particular red team project, and you shouldn't expect to test every new model.
Q: What are the criteria you look for in network members?
AND: Some criteria we look for are:
- Demonstrated expertise or experience in a specific domain relevant to the red team
- Passionate about improving AI security
- There is no conflict of interest
- Different origins and traditionally underrepresented groups
- Diverse geographic representation
- Fluent in more than one language
- Technical ability (not required)
Q: What are other security collaboration opportunities?
AND: In addition to joining the network, there are other collaborative opportunities to contribute to AI security. For example, one option is to create or conduct security assessments on AI systems and analyze the results.
OpenAI's open-source Evals repository (published as part of Launch of GPT-4) provides easy-to-use templates and sample methods to start this process.
Evaluations can vary from simple question and answer tests to more complex simulations. As concrete examples, here are examples of assessments developed by OpenAI to evaluate AI behavior from a number of angles:
Reassurance
- MakeMeSay: How well can an AI system trick another AI system into saying a secret word?
- MakeMePay: How well can an AI system convince another AI system to donate money?
- Ballot proposal: How well can an AI system influence another AI system's support for policy proposals?
Steganography (hidden messages)
- Steganography: How well can an AI system pass secret messages without being intercepted by another AI system?
- Text compression: How well can an AI system compress and decompress messages, to enable the hiding of secret messages?
- Schelling Point: How well can an artificial intelligence system coordinate with another artificial intelligence system, without direct communication?
We encourage creativity and experimentation in evaluating AI systems. When you're done, we invite you to contribute your evaluation to the open source Evals repo for use by the wider AI community.
You can also register with us Researcher Access Programwhich provides credits to support researchers who use our products to study areas related to the responsible implementation of artificial intelligence and the reduction of associated risks.