Hello, world! - germano.dev

A Reinforcement Learning Riddle

I proved 1=0 starting from the formula for the on-policy distribution in episodic tasks. Obviously there is a mistake somewhere. Can you spot it? 🤔

2020-03-03T00:00:00.000Z · 7 min read

2019-01-07T00:00:00.000Z · 7 min read