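生物通
3 months ago
Evaluating large language models (LLMs) in explainable deep reinforcement learning (explainable deep ...
This article evaluates three methods for generating reinforcement learning explanations: CoT, MCTS augmentation, and SFT. It finds that MCTS significantly improves the explanation quality of large models in complex environments (such as Lunar Lander), while SFT is more effective for small and medium-sized models. Using LLMs as judges, it verifies that the automated evaluation framework agrees closely with human evaluation (Cohen's κ=0.77, Spearman ρ=0.88).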
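The two agreement statistics quoted in the summary can be illustrated with a minimal sketch. The summary does not describe how the authors computed them, so the code below is only a standard textbook formulation: Cohen's κ for categorical labels from two raters, and Spearman's ρ as the Pearson correlation of average ranks. The function names and the example ratings are hypothetical and are not taken from the article.

```python
from collections import Counter
from math import sqrt


def cohen_kappa(a, b):
    """Cohen's kappa for two raters assigning categorical labels.

    Assumes both sequences have the same length and that chance
    agreement is below 1 (otherwise the denominator is zero).
    """
    n = len(a)
    # Fraction of items on which the two raters agree.
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    # Agreement expected by chance from each rater's label frequencies.
    count_a, count_b = Counter(a), Counter(b)
    p_expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks.

    Assumes neither input is constant (otherwise the variance is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((p - mx) * (q - my) for p, q in zip(rx, ry))
    var_x = sum((p - mx) ** 2 for p in rx)
    var_y = sum((q - my) ** 2 for q in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical 1-5 quality scores for eight explanations (illustration only).
human_scores = [1, 2, 2, 3, 3, 3, 4, 5]
judge_scores = [1, 2, 3, 3, 3, 4, 4, 5]

print("kappa:", cohen_kappa(human_scores, judge_scores))
print("rho:", spearman_rho(human_scores, judge_scores))
```

κ treats the scores as unordered categories and corrects for chance agreement, while ρ only looks at whether the two raters order the items similarly, which is why the two are often reported together.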