In this video, TwoMinutePapers walks through an OpenAI research report about AI agents learning to play hide-and-seek in a 3D world with a physics engine.
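Both teams in the game are controlled by AI. At first, all the agents run around randomly with no structure at all. After a few million rounds of simulation and adjustment, the hider AI learns to use the boxes in the environment to block the seeker AI's path. After a few million more rounds, the seeker AI learns to use ramps to climb over the walls. After yet another few million rounds, the hider AI learns to hide the ramps in advance.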
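Next, the seeker AI learns to ride on boxes like a skateboard, and the hiders learn to lock all the boxes before the game starts. The hiders even learn to exploit a bug in the physics engine to throw the ramps out over the walls. In the end, the seeker AI actually learns to cheat its way straight over the wall.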