Open problems in RL

Challenges with core algorithms:

Challenges with assumptions:

Stability

Sample Efficiency

  • Need to wait for a long time for your homework to finish running
  • Real-world learning becomes difficult or impractical
  • Precludes the use of expensive, high-fidelity simulators
  • Limits applicability to real-world problems

Scaling up deep RL & Generalization

Supervised Learning (with Imagenet)

  • Large-scale
  • Emphasizes diversity
  • Evaluated on generalization

RL:

  • Small-scale
  • Emphasizes mastery
  • Evaluated on performance

Supervision

Where do supervision come from?

Rethinking the problem formulation

Some perspectives