
[Reinforcement Learning] Q-Learning

FPGA硅农, published 2021-12-15 14:47:05

Problem description

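[Figure: grid world with red obstacles and a blue goal cell]

As shown in the figure, the agent starts from the top-left corner and moves one cell at a time in one of four directions: up, down, left, or right. The task is to find a path to the blue cell that avoids the red obstacles as far as possible. To formalize the problem, we number the 25 cells 0 through 24, which gives 25 states, and treat up, down, left, and right as the four actions, as shown in the following figure:

[Figure: the grid with its cells numbered 0 to 24]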
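To make the state and action encoding concrete, here is a minimal sketch of a transition function. It assumes the 25 cells form a 5x5 grid numbered row by row from the top-left corner, which is consistent with the reward table in the next section (for example, moving right from 8 and moving down from 4 both land on 9). The name `next_state` and the rule that an off-grid move leaves the agent in place are illustrative assumptions, not part of the original code.

```python
N_ROWS, N_COLS = 5, 5                       # assumed 5x5 layout, states 0..24 row by row
ACTIONS = ['left', 'right', 'up', 'down']   # same order as in the experiment code

def next_state(state, action):
    """Return the state reached from `state` by `action`."""
    row, col = divmod(state, N_COLS)
    if action == 'left' and col > 0:
        return state - 1
    if action == 'right' and col < N_COLS - 1:
        return state + 1
    if action == 'up' and row > 0:
        return state - N_COLS
    if action == 'down' and row < N_ROWS - 1:
        return state + N_COLS
    return state  # assumed convention: a move off the grid leaves the agent in place

print(next_state(0, 'right'))   # 1
print(next_state(8, 'right'))   # 9
print(next_state(0, 'up'))      # 0
```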

Reward mechanism

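import numpy as np

def get_init_feedback_table(S, a):
    # Immediate reward R for taking action a in state S.
    # Action indices follow ACTIONS in the experiment code: 0=left, 1=right, 2=up, 3=down.
    tab = np.ones((25, 4))  # an ordinary move earns 1
    # moves that enter an obstacle cell are penalized
    tab[8][1] = -10; tab[4][3] = -10; tab[14][2] = -10                      # into cell 9
    tab[11][1] = -10; tab[13][0] = -10; tab[7][3] = -10; tab[17][2] = -10   # into cell 12
    tab[16][0] = -10; tab[20][2] = -10; tab[10][3] = -10                    # into cell 15
    tab[18][0] = -10; tab[16][1] = -10; tab[22][2] = -10; tab[12][3] = -10  # into cell 17
    # moves that enter the goal, cell 24
    tab[23][1] = 50; tab[19][3] = 50
    return tab[S, a]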

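As the code shows, an action whose next state is a red obstacle gets R = -10, an action that enters the blue goal cell gets R = 50, and every other action gets R = 1. Note that R is not the same thing as the Q-table entry Q(S, A): R is the immediate reward received after taking action A in state S.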
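The hard-coded entries follow a simple rule, so the same table can also be derived from the grid layout. The sketch below assumes row-major numbering on a 5x5 grid and reads the obstacle cells (9, 12, 15, 17) and the goal cell (24) off the entries above. `build_reward_table`, `OBSTACLES`, `GOAL`, and `MOVES` are names introduced here for illustration, and plain lists stand in for the NumPy array.

```python
N_ROWS, N_COLS = 5, 5
OBSTACLES = {9, 12, 15, 17}   # inferred from the -10 entries of the hard-coded table
GOAL = 24                     # inferred from the +50 entries
# Action indices follow the article's order: 0=left, 1=right, 2=up, 3=down.
MOVES = [(0, -1), (0, 1), (-1, 0), (1, 0)]   # (row delta, column delta)

def build_reward_table():
    tab = [[1] * 4 for _ in range(N_ROWS * N_COLS)]   # default reward is 1
    for s in range(N_ROWS * N_COLS):
        row, col = divmod(s, N_COLS)
        for a, (dr, dc) in enumerate(MOVES):
            r, c = row + dr, col + dc
            if not (0 <= r < N_ROWS and 0 <= c < N_COLS):
                continue                  # off-grid move: keep the default reward
            nxt = r * N_COLS + c
            if nxt in OBSTACLES:
                tab[s][a] = -10           # the move enters an obstacle
            elif nxt == GOAL:
                tab[s][a] = 50            # the move enters the goal
    return tab

tab = build_reward_table()
print(tab[8][1], tab[23][1], tab[0][1])   # -10 50 1
```

Deriving the table this way keeps the rewards in sync with the layout if the obstacles are moved, at the cost of fixing the numbering convention in code.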

The Q-learning algorithm

[Figure: the Q-learning algorithm]
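For reference, the standard tabular Q-learning update is Q(S, A) ← Q(S, A) + α [R + γ max_a Q(S', a) − Q(S, A)], where S' is the next state. The following is a minimal sketch of one such update, using the ALPHA and GAMMA values from the experiment code. The function name `q_update`, the `terminal` flag, and the list-of-lists Q table are assumptions made for this illustration and may differ from the author's implementation.

```python
ALPHA = 0.8   # learning rate (value used in the experiment code)
GAMMA = 0.9   # discount factor (value used in the experiment code)

def q_update(q, s, a, r, s_next, terminal=False):
    """Apply one tabular Q-learning step to q[s][a] and return the new value.

    Q(s,a) <- Q(s,a) + ALPHA * (r + GAMMA * max_a' Q(s_next,a') - Q(s,a))
    When s_next is terminal, the bootstrap term is dropped.
    """
    target = r if terminal else r + GAMMA * max(q[s_next])
    q[s][a] += ALPHA * (target - q[s][a])
    return q[s][a]

q = [[0.0] * 4 for _ in range(25)]          # 25 states x 4 actions, all zeros
q_update(q, 23, 1, 50, 24, terminal=True)   # moving right from 23 reaches the goal
q_update(q, 22, 1, 1, 23)                   # moving right from 22 now sees Q(23, right)
print(q[23][1], q[22][1])
```

The first update moves Q(23, right) from 0 to 0.8 × 50 = 40. The second one then propagates part of that value one step back, to 0.8 × (1 + 0.9 × 40) = 29.6, which is how the goal reward spreads toward the start over many episodes.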

Experiment code

import numpy as np
import pandas as pd
import time

N_STATES = 25   # number of states (the 25 grid cells)
ACTIONS = ['left', 'right', 'up', 'down']     # available actions
EPSILON = 0.3   # probability of acting greedily (see choose_action)
ALPHA = 0.8     # learning rate
GAMMA = 0.9     # discount factor
MAX_EPISODES = 1000   # maximum number of episodes
FRESH_TIME = 0.00001    # refresh time for one move

def build_q_table(n_states, actions):
    table = pd.DataFrame(
        np.zeros((n_states, len(actions))),     # Q values start at zero
        columns=actions,    # one column per action name
    )
    return table

def choose_action(state, q_table):
    state_actions = q_table.iloc[state, :]
    # explore when the random draw exceeds EPSILON, or when this state has no learned values yet
    if (np.random.uniform() > EPSILON) or ((state_actions == 0).all()):
        if state==0:
            # top-left corner: only right and down stay on the grid
            action_name=np.random.choice(['right','down'])
        elif state>0 and state20 and state
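The branches of `choose_action` restrict the random pick to moves that stay on the grid: state 0, the top-left corner, can only go right or down. The remaining branches are not preserved in this copy of the listing. What follows is a minimal sketch of the same idea, not the author's code. It assumes row-major numbering on a 5x5 grid, introduces a hypothetical helper `valid_actions`, takes a plain dict in place of a `q_table` row, and, as a further assumption, also limits the greedy choice to on-grid moves.

```python
import random

N_ROWS, N_COLS = 5, 5
ACTIONS = ['left', 'right', 'up', 'down']
EPSILON = 0.3   # as in the article: probability of taking the greedy action

def valid_actions(state):
    """Actions that keep the agent on the grid (row-major numbering assumed)."""
    row, col = divmod(state, N_COLS)
    allowed = []
    if col > 0:
        allowed.append('left')
    if col < N_COLS - 1:
        allowed.append('right')
    if row > 0:
        allowed.append('up')
    if row < N_ROWS - 1:
        allowed.append('down')
    return allowed

def choose_action(state, q_row, rng=random):
    """Pick an action for `state`; q_row maps each action name to its Q value."""
    allowed = valid_actions(state)
    unexplored = all(q_row[a] == 0 for a in ACTIONS)
    if rng.random() > EPSILON or unexplored:
        return rng.choice(allowed)                # explore among on-grid moves
    return max(allowed, key=lambda a: q_row[a])   # exploit the best on-grid move

print(valid_actions(0))    # ['right', 'down']
print(valid_actions(24))   # ['left', 'up']
```

Computing the allowed moves from the row and column avoids one hand-written branch per border region, which is where an if/elif chain over state ranges tends to pick up off-by-one mistakes.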


    
        
            
                
                