Robert Miles AI Safety
  • 48 videos
  • 6,739,818 views
AI Ruined My Year
How to Help: aisafety.info/questions/8TJV/How-can-I-help
www.aisafety.com/
AI Safety Talks: www.youtube.com/@aisafetytalks
There's No Rule That Says We'll Make It: ua-cam.com/video/JD_iA7imAPs/v-deo.html
The other "Killer Robot Arms Race" Elon Musk should worry about: ua-cam.com/video/7FCEiCnHcbo/v-deo.html
Rob's Reading List:
Podcast: rmrlp.libsyn.com/
YouTube Channel: www.youtube.com/@RobMilesReadingList
The FLI Open Letter: ua-cam.com/video/3GHjhG6Vo40/v-deo.html
Yudkowsky in TIME: ua-cam.com/video/a6m7JynBp-0/v-deo.html
Ian Hogarth in the FT: ua-cam.com/video/Z8VvF82T6so/v-deo.html
Links:
The CAIS Open Letter: www.safe.ai/work/statement-on-ai-risk
The FLI Open Letter: futureoflife.org/open-letter/pause-giant-ai-experiments/
The Bletchley Declaration: www.gov.uk/government/publications/ai-safety-summit-2023-the-bletchley-declaration/the-bletchley-declaration-by-countries-attending-the-ai-safety-summit-1-2-november-2023
US Executive Order: www.whitehouse.gov/briefing-room/presidential-actions/2023/10/30/executive-order-on-the-safe-secure-and-trustworthy-development-and-use-of-artificial-intelligence/
Some analysis of the EO: thezvi.substack.com/p/on-the-executive-order
"Sparks of AGI" Paper: arxiv.org/abs/2303.12712
Yudkowsky in TIME: time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/
Hogarth in the FT: www.ft.com/content/03895dc4-a3b7-481e-95cc-336a524f2ac2
The AI Safety Institute: www.gov.uk/government/publications/ai-safety-institute-overview/introducing-the-ai-safety-institute
Responsible Scaling Policies: metr.org/blog/2023-09-26-rsp/
The EU AI Act: artificialintelligenceact.eu/the-act/
Hinton on CBS: ua-cam.com/video/qpoRO378qRY/v-deo.html
Sources:
"Sparks of AGI" Talk: ua-cam.com/video/qbIk7-JPB2c/v-deo.html
Yann LeCun on Lex Fridman's Podcast: ua-cam.com/video/SGzMElJ11Cc/v-deo.html
White House Press Briefings: x.com/TVNewsNow/status/1663640562363252742
ua-cam.com/video/JHNkyHl5FpY/v-deo.html
King Chuck on AI: ua-cam.com/video/0_jw40Ga_mA/v-deo.html
"Equally sharing a cake between three people - Numberphile": ua-cam.com/video/kaMKInkV7Vs/v-deo.html
Community, various screenshots
The Simpsons
Sneakers (1992)
Thanks to Rational Animations for the train sequence!
www.youtube.com/@RationalAnimations
With enormous thanks to my wonderful patrons:
- Tor Barstad
- Timothy Lillicrap
- Juan Benet
- Sarah Howell
- Kieryn
- Mazianni
- Scott Worley
- Jason Hise
- Clemens Arbesser
- Francisco Tolmasky
- David Reid
- Andrew Blackledge
- Cam MacFarlane
- Olivier Coutu
- CaptObvious
- Ze Shen Chin
- ikke89
- Isaac
- Erik de Bruijn
- Jeroen De Dauw
- Ludwig Schubert
- Eric James
- Owen Campbell-Moore
- Raf Jakubanis
- Esa Koskinen
- Nathan Metzger
- Jonatan R
- Gunnar
- Laura Olds
- Paul Hobbs
- Bastiaan Cnossen
- Eric Scammell
- Alexare
- Reslav Hollós
- Jérôme Beaulieu
- Nathan Fish
- Taras Bobrovytsky
- Jeremy
- Vaskó Richárd
- Andrew Harcourt
- Chris Beacham
- Zachary Gidwitz
- Art Code Outdoors
- Abigail Novick
- Edmund Fokschaner
- DragonSheep
- Richard Newcombe
- Joshua Michel
- Richard
- ttw
- Sophia Michelle Andren
- Alan J. Etchings
- James Vera
- Stumbleboots
- Peter Lillian
- Grimrukh
- Colin Ricardo
- DN
- Mr Cats
- Robert Paul Schwin
- Roland G. McIntosh
- Benjamin Mock
- Emiliano Hodges
- Maxim Kuzmich
- Joanny Raby
- Tom Miller
- Eran Glicksman
- CheeseBerry
- Hoyskedotte
- Alexey Malafeev
- Jeff Starr
- Justin
- Liviu Macovei
- Javier Soto
- David Christal
- Jam
- Just Me
- Sebastian Zimmer
- Matt Thompson
- Xan Atkinson
- Andy
- Albert Higgins
- Alexander230
- Clay Upton
- Alex Ander
- Carolyn
- Nathan Rogowski
- David Morgan
- little Bang
- Chad M Jones
- Dmitri Afanasjev
- Christian Oehne
- Marcel Ward
- Andrew Weir
- Miłosz Wierzbicki
- Tendayi Mawushe
- Kees
- loopuleasa
- Marco Tiraboschi
- Fraser Cain
- Patrick Henderson
- Daniel Munter
- Ian
- James Fowkes
- Len
- Yuchong Li
- Diagon
- Puffjanga
- Daniel Eickhardt
- 14zRobot
- Stuart Alldritt
- DeepFriedJif
- Garrett Maring
- Stellated Hexahedron
- Jim Renney
- Edison Franklin
- Piers Calderwood
- Matt Brauer
- Mihaly Barasz
- Rajeen Nabid
- Iestyn bleasdale-shepherd
- Marek Belski
- Luke Peterson
- Eric Rogstad
- Max Chiswick
- slindenau
- Nicholas Turner
- Jannis Funk
- This person's name is too hard to pronounce
- Jon Wright
- Andrei Trifonov
- Bren Ehnebuske
- Martin Frassek
- Matthew Shinkle
- Robby Gottesman
- Ohelig
- Sarah
- Nikola Tasev
- Tapio Kortesaari
- Soroush Pour
- Boris Badinoff
- DangerCat
- Jack Phelps
- Kyle Green
- Lexi X
- John Slape
- Joel Gardner
- Christopher Creutzig
- Johann Puzik
- Pindex
- RMR
- Andrew Edstrom
www.patreon.com/robertskmiles
Views: 192,030

Videos

Why Does AI Lie, and What Can We Do About It?
253K views · 1 year ago
How do we make sure language models tell the truth? The new channel!: www.youtube.com/@aisafetytalks Evan Hubinger's Talk: ua-cam.com/video/OUifSs28G30/v-deo.html ACX Blog Post: astralcodexten.substack.com/p/elk-and-the-problem-of-truthful-ai With thanks to my wonderful Patrons at patreon.com/robertskmiles : - Tor Barstad - Kieryn - AxisAngles - Juan Benet - Scott Worley - Chad M Jones -...
We Were Right! Real Inner Misalignment
245K views · 2 years ago
Researchers ran real versions of the thought experiments in the 'Mesa-Optimisers' videos! What they found won't shock you (if you've been paying attention) Previous videos on the subject: The OTHER AI Alignment Problem: Mesa-Optimizers and Inner Alignment: ua-cam.com/video/bJLcIBixGj8/v-deo.html Deceptive Misaligned Mesa-Optimisers? It's More Likely Than You Think...: ua-cam.com/video/IeWljQw3U...
Intro to AI Safety, Remastered
153K views · 3 years ago
An introduction to AI Safety, remastered from a talk I gave at "AI and Politics" in London The second channel: ua-cam.com/channels/4qH2AHly_RSRze1bUqSSNw.html Experts' Predictions about the Future of AI: ua-cam.com/video/HOJ1NVtlnyQ/v-deo.html 9 Examples of Specification Gaming: ua-cam.com/video/nKJlF-olKmg/v-deo.html www.patreon.com/robertskmiles With thanks to my wonderful Patreon supporters:...
Deceptive Misaligned Mesa-Optimisers? It's More Likely Than You Think...
83K views · 3 years ago
The previous video explained why it's *possible* for trained models to end up with the wrong goals, even when we specify the goals perfectly. This video explains why it's *likely*. Previous video: The OTHER AI Alignment Problem: ua-cam.com/video/bJLcIBixGj8/v-deo.html The Paper: arxiv.org/pdf/1906.01820.pdf Media Sources: End of Ze World - ua-cam.com/video/enRzYWcVyAQ/v-deo.html FlexClip News g...
The OTHER AI Alignment Problem: Mesa-Optimizers and Inner Alignment
223K views · 3 years ago
This "Alignment" thing turns out to be even harder than we thought. # Links The Paper: arxiv.org/pdf/1906.01820.pdf Discord Waiting List Sign-Up: forms.gle/YhYgjakwQ1Lzd4tJ8 AI Safety Career Bottlenecks Survey: www.guidedtrack.com/programs/n8cydtu/run # Referenced Videos Intelligence and Stupidity - The Orthogonality Thesis: ua-cam.com/video/hEUO6pjwFOo/v-deo.html 9 Examples of Specification Ga...
Quantilizers: AI That Doesn't Try Too Hard
84K views · 3 years ago
How do you get an AI system that does better than a human could, without doing anything a human wouldn't? A follow-up to "Maximizers and Satisficers": ua-cam.com/video/Ao4jwLwT36M/v-deo.html The Paper: intelligence.org/files/QuantilizersSaferAlternative.pdf More about this area of research: www.alignmentforum.org/tag/mild-optimization With thanks to my excellent Patreon supporters: www.patreon....
Sharing the Benefits of AI: The Windfall Clause
79K views · 3 years ago
AI might create enormous amounts of wealth, but how is it going to be distributed? The Paper: www.fhi.ox.ac.uk/wp-content/uploads/Windfall-Clause-Report.pdf The Post: www.fhi.ox.ac.uk/windfallclause/ With thanks to my excellent Patreon supporters: www.patreon.com/robertskmiles Gladamas Scott Worley JJ Hepboin Pedro A Ortega Said Polat Chris Canal Jake Ehrlich Kellen lask Francisco Tolmasky Mich...
10 Reasons to Ignore AI Safety
338K views · 4 years ago
Why do some ignore AI Safety? Let's look at 10 reasons people give (adapted from Stuart Russell's list). Related Videos from Me: Why Would AI Want to do Bad Things? Instrumental Convergence: ua-cam.com/video/ZeecOKBus3Q/v-deo.html Intelligence and Stupidity: The Orthogonality Thesis: ua-cam.com/video/hEUO6pjwFOo/v-deo.html Predicting AI: RIP Prof. Hubert Dreyfus: ua-cam.com/video/B6Oigy1i3W4/v-...
9 Examples of Specification Gaming
305K views · 4 years ago
Training AI Without Writing A Reward Function, with Reward Modelling
236K views · 4 years ago
AI That Doesn't Try Too Hard - Maximizers and Satisficers
203K views · 4 years ago
Is AI Safety a Pascal's Mugging?
371K views · 5 years ago
A Response to Steven Pinker on AI
206K views · 5 years ago
How to Keep Improving When You're Better Than Any Teacher - Iterated Distillation and Amplification
169K views · 5 years ago
Why Not Just: Think of AGI Like a Corporation?
155K views · 5 years ago
Safe Exploration: Concrete Problems in AI Safety Part 6
96K views · 5 years ago
Friend or Foe? AI Safety Gridworlds extra bit
42K views · 6 years ago
AI Safety Gridworlds
92K views · 6 years ago
Experts' Predictions about the Future of AI
80K views · 6 years ago
Why Would AI Want to do Bad Things? Instrumental Convergence
247K views · 6 years ago
Superintelligence Mod for Civilization V
70K views · 6 years ago
Intelligence and Stupidity: The Orthogonality Thesis
667K views · 6 years ago
Scalable Supervision: Concrete Problems in AI Safety Part 5
52K views · 6 years ago
AI Safety at EAGlobal2017 Conference
19K views · 6 years ago
AI learns to Create ̵Y̵o̵u̵T̵u̵b̵e̵ ̵V̵i̵d̵e̵o̵s̵ Cat Pictures: Papers in Two Minutes #1
48K views · 6 years ago
What can AGI do? I/O and Speed
118K views · 6 years ago
What Can We Do About Reward Hacking?: Concrete Problems in AI Safety Part 4
113K views · 6 years ago
Reward Hacking Reloaded: Concrete Problems in AI Safety Part 3.5
91K views · 6 years ago
The other "Killer Robot Arms Race" Elon Musk should worry about
99K views · 6 years ago