Zander et al. Complex Eng Syst 2023;3:9 I http://dx.doi.org/10.20517/ces.2023.11 Page 7 of 16
Algorithm 1 Asteroid Smasher Iterative Learning
1: Initialize FIS
2: Set learning rate
3: for each epoch do
4: Create/choose a scenario
5: Test scenario with the current FIS and save score in the interval [0, 1]
6: for i in range(k) do
7: Create an alteration vector h with adjustments to each TSK output parameter
8: Create a new fuzzy system by adding h to the baseline’s parameters
9: Test scenario with altered fuzzy system and save score in the interval [0, 1]
10: end for
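11: Calculate maximum score difference $d = \text{score}_{\max} - \text{score}_{\text{base}}$
12: Create the new current FIS with $\text{FIS} = \text{FIS} + (\text{lr} * d * h)$, where $h$ is the alteration vector of the best performer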
13: end for
In other words, we created a number of slightly modified fuzzy systems per scenario and borrowed slightly
from the alterations made to the best performer. The degree to which the modification is added to the baseline
is a product of the alteration, the given learning rate, and the difference between the maximum score and
baseline. Some stability was gained by adding the last condition; scores that were only slightly better appeared
to warrant proportionally smaller adjustments.
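The update described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the FIS is reduced to a flat list of TSK output parameters, the alteration vector is drawn from a Gaussian (the text does not say how it is sampled), and `toy_evaluate`, `toy_scenario`, `TARGET`, and all default values are hypothetical stand-ins for the game simulation.

```python
import random


def iterative_learning(params, evaluate, make_scenario, epochs=100, k=8,
                       learning_rate=0.5, noise=0.1, seed=0):
    """Perturb the parameters k times per scenario and borrow from the best.

    params        -- list of floats standing in for the TSK output parameters
    evaluate      -- callable(params, scenario) -> score in [0, 1] (hypothetical)
    make_scenario -- callable(rng) -> scenario (hypothetical)
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    rng = random.Random(seed)
    params = list(params)
    for _ in range(epochs):
        scenario = make_scenario(rng)
        base_score = evaluate(params, scenario)
        best_score, best_h = float("-inf"), None
        for _ in range(k):
            # Assumption: Gaussian adjustments, one per output parameter.
            h = [rng.gauss(0.0, noise) for _ in params]
            candidate = [p + dh for p, dh in zip(params, h)]
            score = evaluate(candidate, scenario)
            if score > best_score:
                best_score, best_h = score, h
        # Difference between the best altered score and the baseline score.
        d = best_score - base_score
        # Step size is the product of learning rate, score difference, and
        # alteration. Applied as written in Algorithm 1; the text does not
        # specify any special handling for d <= 0.
        params = [p + learning_rate * d * dh for p, dh in zip(params, best_h)]
    return params


# Hypothetical stand-in for the game: score falls off with squared distance
# from an arbitrary target parameter vector and is clipped to [0, 1].
TARGET = [0.5, -0.3]


def toy_evaluate(params, scenario):
    error = sum((p - t) ** 2 for p, t in zip(params, TARGET))
    return max(0.0, 1.0 - error)


def toy_scenario(rng):
    return None  # the toy score ignores the scenario


start = [0.0, 0.0]
tuned = iterative_learning(start, toy_evaluate, toy_scenario)
print(toy_evaluate(start, None), toy_evaluate(tuned, None))
```

Because the step is scaled by the score difference, a candidate that barely beats the baseline moves the parameters only slightly, which matches the stabilizing effect noted above.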
Scenario creation constitutes an important component of this process. For some problems, this may be trivial,
but others may require consideration of the problem, goal, and available resources. To train for the competition,
highly difficult and random scenarios with a very large number of asteroids were generated in lieu of a battery
of scenarios for reasons primarily related to ease of implementation.
3.2.4. Experimental results
Applying the discussed iterative learning method resulted in a minor but noticeable performance
increase across key competition metrics such as score, win rate in difficult scenarios, and number of successful
hits on asteroids.
Performance comparisons of the base model, the product of optimization, and a model where TSK outputs
were randomly initialized are shown in Figure 3 and Figure 4. It is important to note that, in this application,
a manually tuned agent served as the starting point of optimization rather than the randomly initialized one.
Even a sizable rolling average across epochs still illustrates notable instability. Part of this is likely attributable to
the randomness innate to the extreme training scenarios. However, testing methods similar to those described
previously for increasing the stability of RL may constitute a key area of further research.
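For readers who want to reproduce this kind of smoothing on their own per-epoch scores, a trailing rolling average can be sketched as follows. The function name, the window size, and the sample scores are illustrative assumptions; the paper does not state the window used for its figures.

```python
from collections import deque


def rolling_average(scores, window):
    """Trailing mean over the most recent `window` scores.

    The first window - 1 outputs average over fewer values, so the
    result has the same length as the input.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    total = 0.0
    averages = []
    for score in scores:
        if len(recent) == window:
            total -= recent[0]  # value about to be pushed out of the window
        recent.append(score)
        total += score
        averages.append(total / len(recent))
    return averages


# Made-up per-epoch scores, for illustration only.
epoch_scores = [0.2, 0.6, 0.3, 0.7, 0.4, 0.8]
smoothed = rolling_average(epoch_scores, window=3)
print(smoothed)
```

A wider window suppresses more epoch-to-epoch noise but also delays the appearance of genuine improvements in the curve.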
3.2.5. Explainability
An advantage of the FIS is its ability to encapsulate complex behavior in an interface that deals in partial
truths and natural language; functions and interactions that would prove difficult to interpret otherwise are
meaningfully organized in comprehensible fuzzy sets and rules. More fundamentally, dealing with partial
truths can prove more familiar and intuitive than crisp logic to a human expert.
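To make the idea of partial truths concrete, the following is a minimal sketch of zero-order TSK inference with a single input. The rule names, the triangular membership breakpoints, and the constant thrust outputs are all hypothetical and are not taken from the paper's controller.

```python
def triangular(x, left, peak, right):
    """Degree of membership in [0, 1] for a triangular fuzzy set."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical rule base: IF distance is <label> THEN thrust = <constant>.
# Each constant plays the role of a zero-order TSK output parameter.
RULES = [
    ("near", (-50.0, 0.0, 50.0), -1.0),
    ("medium", (0.0, 50.0, 100.0), 0.0),
    ("far", (50.0, 100.0, 150.0), 1.0),
]


def infer_thrust(distance):
    """Return the crisp thrust and the firing strength of each rule."""
    fired = [(label, triangular(distance, *abc), out)
             for label, abc, out in RULES]
    total = sum(strength for _, strength, _ in fired)
    if total == 0.0:
        return 0.0, fired  # no rule fires; fall back to zero thrust
    # Weighted average of the constant outputs by firing strength.
    crisp = sum(strength * out for _, strength, out in fired) / total
    return crisp, fired


thrust, fired = infer_thrust(25.0)
for label, strength, out in fired:
    print(f"distance is {label}: truth {strength:.2f}, thrust {out:+.1f}")
print("crisp thrust:", thrust)
```

At a distance of 25 the input is half "near" and half "medium", so the output lies halfway between the two rule constants. Each contribution can be read directly from the rule that produced it, which is the property the section describes.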
In this case, the developed systems are composed of a natural language rule set and associated fuzzy sets that
lend themselves to visualization. Figure 5 illustrates this with plots of antecedent fuzzy sets and 0th-order
outputs describing varying levels of ship thrust. Included is the associated rule set in which the relationships