Ernest et al. Complex Eng Syst 2023;3:4 | http://dx.doi.org/10.20517/ces.2022.54 Page 19 of 22
Figure 16. Four frames from the mission-failure scenario with the specification 4-adherent siege tank firing control: (Upper Left) scenario
start, showing the siege tank taking a safe shot; (Upper Right) the siege tank does not fire on the zergling group in proximity to the top
marine; (Lower Left) the top marine is defeated by the zergling group and the siege tank continues holding fire; (Lower Right) mission
failure, because all marines are defeated and the siege tank cannot target nearby units.
Output MFs referenced in the RB, as well as in the Shot Effectiveness/Safety Comparison FIS. A version of
the modified training chromosome was manually tuned to the point where adherence to this specification was
proven by JKIND. Figure 16 shows the performance of this new system, which essentially opts out of
taking a meaningful shot at Frame 2, when the hostile army is heavily grouped around the top marine. While
the system adheres to the specification, the mission ends in failure.
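The trade-off between adherence and mission success can be illustrated with a minimal sketch. This is not the GFT or the JKIND model used in this work. It assumes a hypothetical fire-control rule with a single tunable threshold, `min_safe_distance`, an illustrative `SPLASH_RADIUS` value, and a hypothetical safety property stating that the tank never fires while a friendly unit is inside the splash radius of the target. An exhaustive search over a coarse input grid stands in for the model checker and returns a counterexample when one exists.

```python
from itertools import product

# Assumed value for illustration only; not taken from the paper.
SPLASH_RADIUS = 1.25

# Coarse, hypothetical input grid: enemy counts 0..10, distances 0.0..10.0.
ENEMY_COUNTS = range(0, 11)
DISTANCES = [i * 0.25 for i in range(0, 41)]


def fires(enemy_count, friendly_distance, min_safe_distance):
    """Hypothetical rule: fire when targets exist and the nearest
    friendly unit is at least min_safe_distance from the target."""
    return enemy_count > 0 and friendly_distance >= min_safe_distance


def find_counterexample(min_safe_distance):
    """Return the first grid input where the rule fires with a friendly
    unit inside the splash radius, or None if no violation is found."""
    for n, d in product(ENEMY_COUNTS, DISTANCES):
        if fires(n, d, min_safe_distance) and d < SPLASH_RADIUS:
            return (n, d)
    return None


def ever_fires(min_safe_distance):
    """Check whether the rule fires for any input on the grid."""
    return any(fires(n, d, min_safe_distance)
               for n, d in product(ENEMY_COUNTS, DISTANCES))
```

In this toy setting, a threshold below the splash radius yields a counterexample, while a threshold at the splash radius satisfies the property and still permits shots. A threshold far above every reachable distance also satisfies the property, but only because the rule never fires. That is the same kind of adherent but ineffective behavior seen in Figure 16.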
4. DISCUSSION
As can be seen in the Results section, the performance of the AI system after the reinforcement learning process
was such that it was likely superhuman [25]. Note that this depends heavily on the hypothetical human
player. Because the actions per minute of the AI system were not constrained, it could select and execute actions much
faster than any human likely could. This also differs somewhat from other SC2 AI systems such as
AlphaStar, which are restricted to near-human limits on information and action execution rates.
The purpose of the AI in this work, though, is to perform a subset of actions in particular scenarios; it is not
meant to be a general SC2 player. Instead, at least for specifications 1 through 3, the focus was on creating a
high-performance AI for unit control in difficult engagements that was also explainable and formally proven
to adhere to safety specifications.
With respect to the safety specifications, both simulation and verification with the model checker JKIND
showed that counterexamples existed for most of the specifications. That is, there were
conditions under which the system could output actions that would violate the specifications. Owing to the
transparent and explainable nature of GFTs, the counterexamples were then used to help identify the parameters that
needed correction. This correction relied on expert knowledge, specifically familiarity with Starcraft 2, but could
also be aided by further reinforcement learning runs to correct undesired behaviors. After correction, the