CHAPTER 01 - INTRODUCTION

CHAPTER 02 - THE MULTI-ARMED BANDIT PROBLEM

CHAPTER 03 - MARKOV DECISION PROCESSES (MDP)

CHAPTER 04 - DYNAMIC PROGRAMMING

CHAPTER 05 - MONTE CARLO METHODS

CHAPTER 06 - TEMPORAL-DIFFERENCE LEARNING

CHAPTER 07 - LEARNING BASED ON N-STEP BOOTSTRAPPING

CHAPTER 08 - PLANNING AND LEARNING WITH TABULAR METHODS

PART 02 - APPROXIMATE SOLUTION METHODS

CHAPTER 09 - ON-POLICY PREDICTION WITH APPROXIMATION

CHAPTER 10 - ON-POLICY CONTROL WITH APPROXIMATION

CHAPTER 11 - OFF-POLICY METHODS WITH APPROXIMATION

CHAPTER 12 - ELIGIBILITY TRACES

CHAPTER 13 - POLICY GRADIENT METHODS

PART 03 - IN-DEPTH EXPLORATION

CHAPTER 14 - PSYCHOLOGICAL CONCEPTS

CHAPTER 15 - NEUROSCIENCE

CHAPTER 16 - VARIOUS PRACTICAL APPLICATIONS

CHAPTER 17 - FRONTIERS

Practical Exercise 02

Practical Exercise 03

Practical Exercise 04

Practical Exercise 05

Practical Exercise 06

Practical Exercise 07

Practical Exercise 08

Practical Exercise 09

Practical Exercise 10

Practical Exercise 11

Practical Exercise 12

Practical Exercise 13

Practical Exercise 14

Practical Exercise 15

Practical Exercise 16

Practical Exercise 17

Reinforcement Preparation

Chapter 2 of the “Reinforcement” module covers multi-armed bandit problems and introduces the fundamental concepts used to solve them.

A k-armed Bandit Problem

The multi-armed bandit problem (also called the k-armed bandit problem) is a setting in which an agent must repeatedly choose among k options (or arms), each with an unknown reward distribution. The goal is to maximize the total reward accumulated over a series of trials.
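The setting described above can be made concrete with a small simulation. The following is a minimal sketch, not a reference implementation: the class name `KArmedBandit`, the function `run_random_agent`, and the choice of Gaussian rewards with unit variance are all assumptions made for illustration. The agent here picks arms uniformly at random, which serves only as a baseline for the total reward.

```python
import random


class KArmedBandit:
    """A bandit with k arms; each arm pays a noisy reward around a hidden mean.

    Assumption for illustration: rewards are Gaussian with unit variance,
    and the hidden means are themselves drawn from a standard normal.
    """

    def __init__(self, k, seed=None):
        self.k = k
        self._rng = random.Random(seed)
        # True mean reward of each arm; the agent never sees these values.
        self._means = [self._rng.gauss(0.0, 1.0) for _ in range(k)]

    def pull(self, arm):
        """Return one sampled reward for the chosen arm."""
        if not 0 <= arm < self.k:
            raise ValueError(f"arm must be in [0, {self.k - 1}], got {arm}")
        return self._rng.gauss(self._means[arm], 1.0)


def run_random_agent(bandit, trials, seed=None):
    """Pull arms uniformly at random and return (total reward, pulls per arm)."""
    rng = random.Random(seed)
    total_reward = 0.0
    counts = [0] * bandit.k
    for _ in range(trials):
        arm = rng.randrange(bandit.k)  # no learning: every arm is equally likely
        total_reward += bandit.pull(arm)
        counts[arm] += 1
    return total_reward, counts


if __name__ == "__main__":
    bandit = KArmedBandit(k=10, seed=0)
    total, counts = run_random_agent(bandit, trials=1000, seed=1)
    print("total reward:", total)
    print("pulls per arm:", counts)
```

Because the random agent ignores the rewards it observes, its total reward tends toward the average of the arm means times the number of trials. A strategy that uses the observed rewards to favor better arms would be expected to do better, which is the core of the problem.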