
Ernest et al. Complex Eng Syst 2023;3:4 | http://dx.doi.org/10.20517/ces.2022.54  Page 19 of 22

               Figure 16. The four frames from the mission failure, 4-adherent siege tank firing control scenario are (Upper Left) scenario start showing
               siege tank taking a safe shot, (Upper Right) siege tank does not fire on zergling group in proximity to top marine, (Lower Left) top marine is
               defeated by zergling group and siege tank continues holding fire, and (Lower Right) mission failure because all marines defeated and siege
               tank cannot target nearby units.


               Output MFs referenced in the RB, as well as in the Shot Effectiveness/Safety Comparison FIS. A version of
               the modified training chromosome was manually tuned to the point where adherence to this specification was
               proven by JKIND. Figure 16 shows the performance of this new system, which essentially opts out of
               taking a meaningful shot at Frame 2, when the hostile army is heavily grouped around the top marine. While
               adherent to the spec, this mission ends in failure.
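
The trade-off described above can be illustrated with a minimal sketch. It is not the paper's GFT or a JKIND model: the decision rule, the `SPLASH_RADIUS` and `MIN_TARGETS` constants, and the form of the specification ("never fire when a friendly unit is within splash radius of the target") are all hypothetical assumptions chosen for illustration. A bounded grid search stands in for the model checker; unlike JKIND, it only covers the sampled points and proves nothing beyond them.

```python
# Hypothetical constants, chosen only for illustration (not from the paper).
SPLASH_RADIUS = 1.25  # assumed blast radius around the target
MIN_TARGETS = 3       # assumed minimum enemy count for a worthwhile shot


def should_fire(enemy_count, friendly_distance, min_safe_distance):
    """Toy stand-in for the fire-control decision (not the paper's GFT).

    Fires when enough enemies are grouped at the target and the nearest
    friendly unit is at least min_safe_distance away from it.
    """
    return enemy_count >= MIN_TARGETS and friendly_distance >= min_safe_distance


def violates_spec(enemy_count, friendly_distance, min_safe_distance):
    """Assumed safety spec: firing implies friendly_distance > SPLASH_RADIUS."""
    fires = should_fire(enemy_count, friendly_distance, min_safe_distance)
    return fires and not friendly_distance > SPLASH_RADIUS


def find_counterexamples(min_safe_distance):
    """Search a bounded grid of inputs for spec violations."""
    counterexamples = []
    for enemy_count in range(0, 11):
        for step in range(0, 13):
            friendly_distance = step / 4  # 0.0, 0.25, ..., 3.0
            if violates_spec(enemy_count, friendly_distance, min_safe_distance):
                counterexamples.append((enemy_count, friendly_distance))
    return counterexamples


if __name__ == "__main__":
    # A loosely tuned parameter admits unsafe shots on the grid.
    print(len(find_counterexamples(1.0)), "counterexamples with threshold 1.0")
    # A stricter parameter removes them on this grid...
    print(len(find_counterexamples(1.5)), "counterexamples with threshold 1.5")
    # ...but the tank then holds fire when enemies crowd a friendly unit.
    print("fires on crowded marine:", should_fire(8, 0.5, 1.5))
```

In this toy setting, tightening the single parameter removes every counterexample on the grid, but the rule then also refuses the shot when the enemy group sits on top of a friendly unit, which mirrors the hold-fire behavior seen in Frame 2.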




               4. DISCUSSION
               As can be seen in the Results section, the performance of the AI system after the reinforcement learning process
               was such that it was likely superhuman [25]. Note that this is heavily dependent on the hypothetical human
               player. As the actions per minute of the AI system were not constrained, it could select and execute actions much
               faster than any human likely could. Note that this is also somewhat different from other SC2 AI systems like
               AlphaStar, as those are constrained to near-human limits on information and action execution rates.
               The purpose of the AI in this work, though, is to perform a subset of actions in particular scenarios; it is not
               meant to be a general SC2 player. Instead, at least for specifications 1 through 3, the focus was on creating a
               high-performance AI for unit control in difficult engagements that was also explainable and formally proven
               to adhere to safety specifications.


               With respect to the safety specifications, both simulation and verification with the model checker JKIND
               found counterexamples for most of the specifications. That is, there were
               conditions where the system could output actions that would violate the specifications. The counterexamples,
               due to the transparent and explainable nature of GFTs, were then used to assist in identifying parameters that
               needed correction. This correction relied on expert knowledge, specifically familiarity with StarCraft II, but could
               also be aided by further reinforcement learning runs to correct undesired behaviors. After correction, the