Re: Q-learning instead of PIDs
You are right that most Q-learning algorithms involve searching over a finite set of discrete actions, but there is plenty of work on continuous state and action spaces. The best example I can think of is robot navigation using Q-learning.
Even with discrete time steps, it still works (I know first-hand). What essentially happens is that the current state is constantly being refreshed (as fast as the sensor(s) can update it, at least). What I did was cap the input rate at 5 Hz: I gave the learning algorithm 100 ms to find the best action(s), another 100 ms to execute it, then repeated.
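To make the idea concrete, here is a minimal sketch of that kind of fixed-rate loop: a tabular Q-learner over a discretized continuous sensor reading, where each iteration stands in for one 200 ms cycle (100 ms to pick an action, 100 ms to execute it). The bin count, action set, setpoint, and toy "sensor" below are my own illustrative assumptions, not the actual robot setup.

```python
import random

N_BINS = 10                        # discretize the continuous sensor reading
ACTIONS = [-1, 0, +1]              # e.g. steer left / hold / steer right
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.2

# Q-table over (discrete state, action) pairs
Q = {(s, a): 0.0 for s in range(N_BINS) for a in ACTIONS}

def discretize(reading, lo=0.0, hi=1.0):
    """Map a continuous sensor reading into one of N_BINS states."""
    b = int((reading - lo) / (hi - lo) * N_BINS)
    return min(max(b, 0), N_BINS - 1)

def choose_action(s):
    """Epsilon-greedy action selection (the 100 ms 'thinking' budget)."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)])

def toy_sensor(pos):
    """Stand-in for a real sensor: noisy reading of the true position."""
    return min(max(pos + random.gauss(0.0, 0.02), 0.0), 1.0)

pos = 0.9                          # start away from the 0.5 setpoint
for step in range(200):            # each iteration = one 200 ms cycle
    s = discretize(toy_sensor(pos))           # freshest sensor snapshot
    a = choose_action(s)                      # pick action (100 ms budget)
    pos = min(max(pos + 0.05 * a, 0.0), 1.0)  # execute it (next 100 ms)
    r = -abs(pos - 0.5)                       # reward: nearer setpoint = better
    s2 = discretize(toy_sensor(pos))          # observe resulting state
    best_next = max(Q[(s2, a2)] for a2 in ACTIONS)
    Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
    # on the real robot, you'd sleep out the remainder of the 200 ms here
```

The point is the one made above: the learner never needs the full continuous state, it just acts on the latest discretized snapshot each cycle, and the fixed 5 Hz cadence keeps sensing, deciding, and acting in lockstep.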
__________________
"You're a gentleman," they used to say to him. "You shouldn't have gone murdering people with a hatchet; that's no occupation for a gentleman."