AI Safety and Governance
Summary
Artificial Intelligence (AI) technologies are advancing rapidly and already match or outperform humans in a growing number of areas, including writing code, predicting protein structures, understanding humour, and passing the bar exam. AI has the potential to serve society and significantly improve living conditions. Like any other powerful technology, however, it also comes with inherent risks, some of which are potentially catastrophic.
As AI technologies are likely to become even more powerful in the near future and integrate further into our lives, it becomes all the more important to acknowledge and mitigate these risks. Many AI experts, academics, policymakers, and public figures are concerned about the most severe risks of AI, up to and including human extinction. By prioritising the development of safe and responsible AI practices, we can fully harness the potential of this technology for the benefit of humanity.
Engineers may be able to contribute effectively to reducing the risk from transformative AI by switching to technical AI safety research, bringing their technical experience to AI governance, or participating in advocacy and field-building.
This Article is a Stub Page
If you’re looking for a way to contribute to HI-Eng, can spare 3 hours a week, and would like to try out some generalist research by fleshing out this article, please contact us!
How to Use This Page
The first link in each section provides a general overview. You can read the following links to dive deeper.
Published: 12 Oct 2023 by Jessica Wen
Cause area overview
What is AI?
Introduction to Artificial Intelligence (AI) and Machine Learning (ML)
Current capabilities and risks
Viewing suggestion
Future capabilities and risks
Capabilities of AI in the future
Current trends show rapid progress in the capabilities of ML systems – 80,000 Hours
When will the first weakly general AI system be devised, tested, and publicly announced? – Metaculus
Risks of Advanced/Transformative/General AI
Viewing suggestion
How can engineers do impactful work in AI Safety and Governance?
What are the bottlenecks?
Technical AI Safety
One of the most promising ways to reduce the risks from AI systems is to work on technical AI safety. Potentially the most high-impact work in this field focuses on making sure that AI systems are aligned, i.e. that they do what we want them to do, aren’t power-seeking, are robust against misuse, and do not cause accidental or intentional harm. Read more in 80,000 Hours’ career review of AI safety technical research.
The types of work you can do in technical AI safety include (taken from AI Safety Fundamentals’ career guide on technical AI safety; a toy code sketch of a misspecified objective follows the list below):
Research: Advancing research agendas aimed at developing algorithms and training techniques to create AI and ML systems that align with human intentions, offer improved controllability, or enhance transparency.
Engineering: Implementing alignment strategies within real systems necessitates the collaboration of ML engineers, software engineers, infrastructure engineers, and other technical professionals typically found in modern technology companies.
Support Roles: This includes activities like recruitment and operations at rapidly growing organisations focused on building aligned artificial intelligence. Note that our guide does not currently cover advice for these roles.
Other Opportunities: As the field expands, alignment roles are likely to resemble positions within the broader tech ecosystem, such as product development, design, and technical management. Additionally, there may be other opportunities not yet considered in this list.
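To make the alignment framing above more concrete, here is a small, hypothetical Python sketch of a misspecified objective (sometimes called specification gaming or reward hacking). The cleaning-robot setup and the two policies are invented for illustration; they are not taken from any of the resources above.

```python
# Toy illustration of a misspecified objective: we *intend* for a cleaning robot to
# leave the room clean, but we *reward* it per unit of dirt collected. A policy that
# dumps dirt back out and re-collects it scores higher on the proxy reward while
# doing no better on the thing we actually care about.

def proxy_reward(events):
    """The reward the designer wrote down: +1 for every unit of dirt collected."""
    return sum(1 for e in events if e == "collect")

def dirt_remaining(initial_dirt, events):
    """What we actually care about: how much dirt is left at the end."""
    dirt = initial_dirt
    for e in events:
        if e == "collect":
            dirt -= 1
        elif e == "dump":
            dirt += 1
    return dirt

honest_policy = ["collect"] * 5                              # cleans 5 units of dirt, then stops
gaming_policy = ["collect", "dump"] * 10 + ["collect"] * 5   # re-creates work for itself

for name, events in [("honest", honest_policy), ("gaming", gaming_policy)]:
    print(name,
          "| proxy reward:", proxy_reward(events),
          "| dirt left:", dirt_remaining(5, events))
# The gaming policy earns 3x the proxy reward while leaving the room no cleaner.
```

Real alignment research deals with far subtler versions of this mismatch in large learned systems, but the core pattern, optimising a proxy that comes apart from what we intended, is the same.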
AI Policy and Strategy
To reduce risks from powerful AI, technical work needs to be complemented by strategy work on more human-centred questions, such as how to avoid an arms race to develop powerful AI and how to distribute its benefits widely and fairly. Some good introductory resources and starting points are below:
If you’re interested in the AI industry and cloud providers’ views on AI policy (they are very vocal about them!), here are some places to start:
Microsoft’s President Brad Smith wrote about How do we best govern AI?
Anthropic has several posts, including their Responsible Scaling Policy, which also points to other ways technical people can contribute; for example, if you’re an expert in chemical, biological, radiological, or nuclear risks, you would make an excellent red-teamer.
OpenAI has created a Preparedness team to work on similar things, but as of this writing they haven’t published very much yet.
Google DeepMind’s Responsibility and Safety team has put out some great work in the past.
The newsletter Learning From Examples, written by one of their team members, does a great job of taming the flood of AI news.
Many think tanks focus on AI policy and strategy, and some produce excellent work. Greg Allen’s work at the Center for Strategic and International Studies, especially his writing on export controls, comes highly recommended by Chris Phenicie. To keep up to date with research from other think tanks, you can follow newsletters from places like:
Center for Security and Emerging Technology, especially their policy.ai newsletter
Ben Buchanan and the work he did at CSET describing the AI Triad
RAND (e.g. their work on AI)
Center for a New American Security (e.g. their work on AI)
Institute for Progress (e.g., their work on emerging technology)
If you’re interested in how these views get communicated to politicians and which approaches are successful, watching congressional hearings can be very useful. In particular, watch the “Oversight of AI” hearings that led to the “Blumenthal-Hawley” framework for governing AI. These hearings include:
Some other interesting hearings include:
If you are interested in understanding China’s progress and approach to AI, the following resources are useful:
ChinaTalk for a focus on China; it is increasingly about AI and occasionally features very technical guests
Subscribe to the newsletter ChinAI
If you’re more interested in immediate problems and solutions, AI Snake Oil is a great source of information on the capabilities and limitations of AI.
Compute Governance
Although compute governance is a sub-category of AI policy and governance, we think it is a particularly promising way for engineers to use their technical knowledge to help make sure AI development goes well, so we have separated it out into its own section. A rough worked example of the compute arithmetic behind this approach appears at the end of this section.
For an overview of compute governance, the following resources are excellent:
Engineered for Impact: AI Governance from an engineering perspective
Paying attention to the state of the art in AI chips and data centres, and being able to explain these to laypeople, can put you in a good position to work on compute governance. Some places to keep up to date include:
The three people above have a podcast called Transistor Radio
And the podcast The Circuit
SemiWiki, including the SemiWiki podcast
Import AI is more about software, but it is presented in a way that makes it clearer how software news relates to AI policy
The Open Compute Project is also a great resource and you can even join their calls.
A good series to sit in on is their security calls. Knowing a lot about hardware security seems impactful and potentially has great career prospects even if AI x-risk ends up being unimportant.
One project that spun out of these calls is Caliptra, which seems like a great project to keep tabs on and contribute to.
For some ideas of how to apply technical concepts to governance, the following reports and papers are very useful:
“An assessment of data center infrastructure’s role in AI governance” by Konstantin Pilz
If you really want to dig into some government text with a technical angle, there’s the recent AI Executive Order as well as the export controls on chips and chip-making equipment (though don’t be discouraged if these are hard to read or seem boring!).
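To give a feel for why compute is such a tractable lever for governance, here is a rough, back-of-the-envelope Python sketch of the arithmetic behind compute-based reporting thresholds, such as the 10^26-operation threshold in the 2023 AI Executive Order. The 6 × parameters × tokens rule of thumb is a common approximation for training compute, and the model sizes below are hypothetical examples, not figures from any real training run.

```python
# Back-of-the-envelope estimate of training compute vs. a policy reporting threshold.
# Assumes the common ~6 * parameters * tokens approximation for training FLOP;
# the model sizes and token counts below are hypothetical.

REPORTING_THRESHOLD_FLOP = 1e26  # reporting threshold used in the 2023 AI Executive Order

def training_flop(n_parameters: float, n_training_tokens: float) -> float:
    """Approximate total training compute in floating-point operations."""
    return 6 * n_parameters * n_training_tokens

examples = [
    (7e9, 2e12),    # ~7B-parameter model trained on ~2T tokens
    (70e9, 2e12),   # ~70B-parameter model trained on ~2T tokens
    (2e12, 20e12),  # ~2T-parameter model trained on ~20T tokens
]

for params, tokens in examples:
    flop = training_flop(params, tokens)
    status = "above" if flop >= REPORTING_THRESHOLD_FLOP else "below"
    print(f"{params:.0e} params, {tokens:.0e} tokens -> ~{flop:.1e} FLOP ({status} threshold)")
```

The point is not precision: even crude estimates like this let policymakers draw a bright line around the largest training runs, which is part of what makes compute a workable governance lever.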
Career moves
General: figuring out where you could fit in the AI safety ecosystem
You can explore all the organisations working on AI existential safety at AI Safety World
80,000 Hours career advising
AI Safety Quest has 1-1 advising and other resources to help you navigate the AI safety ecosystem
Book a call with AI Risk Discussions
AI Safety Ideas: a database of ideas that could contribute to alignment
Working in AI Alignment Career Guide – AI Safety Fundamentals
Technical AI alignment research career flow chart (it starts with a CS/math undergrad, but could still be applicable to you!)
Upskilling into ML and AI
Particularly suitable for any engineer with some experience in software engineering and/or ML, and generally a good skill set to have! A minimal code example of the kind of hands-on practice these resources build towards is sketched after this list.
BlueDot Impact’s AI Safety Fundamentals
3blue1brown’s linear algebra YouTube playlist
Coursera’s free Machine Learning course with Andrew Ng
Prerequisites: Weeks 1-6 of MIT 6.036, followed by Lectures 1-13 of the University of Michigan’s EECS498 or Weeks 1-6 and 11-12 of NYU’s Deep Learning
Database of training programs, courses, conferences, and other events for AI existential safety
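As a concrete (and entirely hypothetical) example of the baseline skill these courses build towards, the sketch below trains a tiny neural network on synthetic data with PyTorch. It assumes the torch package is installed and is only meant to show the shape of a standard training loop, not any particular safety technique.

```python
# Minimal supervised-learning sketch: fit a small MLP to noisy synthetic data with PyTorch.
import torch
import torch.nn as nn

# Synthetic regression data: y = 3x + noise
torch.manual_seed(0)
x = torch.linspace(-1, 1, 256).unsqueeze(1)   # shape (256, 1)
y = 3 * x + 0.1 * torch.randn_like(x)

# A small multi-layer perceptron
model = nn.Sequential(
    nn.Linear(1, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
)
optimiser = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.MSELoss()

# Standard training loop: forward pass, loss, backward pass, parameter update
for step in range(500):
    optimiser.zero_grad()
    prediction = model(x)
    loss = loss_fn(prediction, y)
    loss.backward()
    optimiser.step()
    if step % 100 == 0:
        print(f"step {step}: loss = {loss.item():.4f}")
```

If you can write, debug, and explain a loop like this, you have the starting point that most of the research engineering roles discussed above build on.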
Alignment Research
Particularly suitable for engineers who are good at conducting and driving their own research. You can build up your background in ML using the resources above.
Resources that new alignment researchers should know about – EA Forum
Develop your own views on alignment.
Here are some exercises you can do to try to develop your own views.
Write up your thoughts and share them with others, e.g. on the Alignment Forum, on Medium, or on your own blog.
See more advice in Rohin Shah’s FAQ for AI Alignment Researchers
Advocacy and Field-building
You can contribute to advocacy and field-building without much technical knowledge, and with varying degrees of commitment. General communication skills could be useful here.
Participate in an existing AI Safety group
Start your own AI Safety discussion group or reading group!
Help with Alignment Ecosystem Development
AGI safety field-building projects I’d like to see – EA Forum
Enter journalism through the Tarbell Fellowship
Work with the Existential Risk Observatory
Join Pause AI, the Campaign for AI Safety, the Global AI Moratorium, Stop AGI, and/or SaferAI
Positioning yourself as a policy expert
We wrote a post about 5 engineering skillsets that are valuable in policy. You might find that you could be a good fit!
Emergingtechpolicy.org, a website with expert advice and resources on US emerging tech policy careers, including AI policy
Keeping up to date with the latest advancements, e.g. through the Center for AI Safety’s ML Safety Newsletter
Go into emerging technologies policy
International relations, especially China-Western relations, and foreign policy
Becoming an expert in AI hardware
Particularly suitable for electrical engineers, materials scientists, embedded systems engineers, and other types of engineers who work with semiconductor chips and in the semiconductor supply chain.
What does it mean to become an expert in AI Hardware? – EA Forum
Advice and resources for getting into technical AI governance – Google doc
Follow the SemiAnalysis newsletter for updates in the semiconductor industry
Organisations doing compute governance:
Centre for the Governance of AI (GovAI)
Rethink Priorities’ AI Governance and Strategy team: Compute Governance workstream
RAND (in particular the Technology and Security Policy Fellows)
Be wary of joining chip designers like Nvidia, semiconductor firms, or cloud providers. These roles could be useful for developing career capital in the form of skills, connections, and credentials, but you could also unintentionally speed up AI progress (see the risks, pitfalls, and things to keep in mind section).
Risks, pitfalls, and things to keep in mind
Accelerating AI capabilities without advances in safety could be very bad
You might advance AI capabilities, which could be (really) harmful – 80,000 Hours
Pausing AI developments isn’t enough. We need to shut it all down – Time Magazine
Mitigating downsides of a particular role
Don’t work in certain positions unless you feel confident that the lab is a force for good. This includes some technical work, like work that improves the efficiency of training very large models, whether via architectural improvements, optimiser improvements, improved reduced-precision training, or improved hardware. We’d also guess that roles in marketing, commercialisation, and fundraising tend to contribute to hype and acceleration, and so are somewhat likely to be harmful.
Think carefully, and take action if you need to. Take the time to consider the work you’re doing and how it’ll be disclosed outside the lab. For example, will publishing your research lead to harmful hype and acceleration? Who should have access to any models that you build? Be an employee who pays attention to the actions of the company you’re working for, and speaks up when you’re unhappy or uncomfortable.
Consult others. Don’t be a unilateralist. It’s worth discussing any role in advance with others. We can give you 1-1 advice, for free. If you know anyone working in the area who’s concerned about the risks, discuss your options with them. You may be able to meet people through our community, and our advisors can also help you make connections with people who can give you more nuanced and personalised advice.
Continue to engage with the broader safety community. To reduce the chance that your opinions or values will drift just because of the people you’re socialising with, try to find a way to spend time with people who more closely share your values. For example, if you’re a researcher or engineer, you may be able to spend some of your working time with a safety-focused research group.
Be ready to switch. Avoid being in a financial or psychological situation where it’s just going to be really hard for you to switch jobs into something more exclusively focused on doing good. Instead, constantly ask yourself whether you’d be able to make that switch, and whether you’re making decisions that could make it harder to do so in the future.
Learn more
Additional resources
BlueDot Impact’s AI Safety and AI Governance courses
Victoria Krakovna’s AI Safety Resources
AI Safety Support’s Lots of Links page
EA Eindhoven’s AI safety link compilation
The AI Alignment Forum, LessWrong, and the EA Forum
Relevant organisations
AI Safety Research Groups list on AI Safety Support’s Lots of Links page