cross-posted from: https://lemmy.pierre-couy.fr/post/1059609

Hey everyone!

I’ve been working on my own toy reinforcement learning (RL) framework for a while now and have nearly implemented a full Rainbow agent. The distributional component is still missing, because some of my design choices make it tricky to integrate. Along the way, I’ve used the framework to experiment with various concepts, mainly reward normalization strategies and exploration policies.

I started by training the agent on simpler games like Snake, but things got really interesting when I moved on to Super Mario Bros. Watching the agent learn and improve has been incredibly fun, so I figured, why not share the experience? That’s why I’m streaming the learning process live!

Right now, the stream is fairly simple, but I plan to add overlays showing key details about the training run, such as hyperparameters, training steps/episodes, performance graphs, and maybe even a real-time visualization of the agent’s actions.

If you have any ideas on how to make the stream more engaging, or if you’re curious about the implementation, feel free to ask!

  • pcouy@lemmy.pierre-couy.fr (OP)
    13 days ago

    When the agent gets stuck or does nothing, it often keeps doing nothing until the episode times out. I’m adding a shorter time limit so that it wastes less of the overall training time being stuck.
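
For anyone curious what such a shorter time limit could look like, here is a minimal sketch. It is not the framework's actual code. It assumes the environment's `step` returns the five-tuple `(obs, reward, terminated, truncated, info)` and that `info` exposes some progress measure. The class name `StuckTimeLimit`, the `max_idle_steps` parameter, and the `"x_pos"` key are all hypothetical names chosen for illustration.

```python
class StuckTimeLimit:
    """Hypothetical wrapper: truncate an episode when progress stalls.

    Assumes `env.step` returns (obs, reward, terminated, truncated, info)
    and that `info[progress_key]` grows as the agent advances.
    """

    def __init__(self, env, max_idle_steps=200, progress_key="x_pos"):
        self.env = env
        self.max_idle_steps = max_idle_steps
        self.progress_key = progress_key
        self._best = None  # best progress value seen this episode
        self._idle = 0     # consecutive steps without a new best

    def reset(self, **kwargs):
        # Clear the stall counters at the start of every episode.
        self._best = None
        self._idle = 0
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, reward, terminated, truncated, info = self.env.step(action)
        progress = info.get(self.progress_key, 0)
        if self._best is None or progress > self._best:
            # New best progress: the agent is not stuck.
            self._best = progress
            self._idle = 0
        else:
            self._idle += 1
        if self._idle >= self.max_idle_steps:
            # End the episode early instead of waiting for the full timeout.
            truncated = True
        return obs, reward, terminated, truncated, info
```

Marking the early stop as a truncation rather than a termination keeps it distinguishable from a real game over, which matters if the agent bootstraps value estimates from the last state.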