Page 4 of 15 Tu et al. Soft Sci 2023;3:25 https://dx.doi.org/10.20517/ss.2023.15
(1) Multimodal inputs first enter the sensory cortical networks (including the visual, auditory, somatosensory, gustatory, and olfactory cortices) in a modality-selective manner. (2) The sensory cortices then innervate one another, receiving information from other modalities rather than only from their corresponding modalities. These crossmodal projections promote sensory integration at an early stage; here, the temporal coherence of the multisensory stimuli helps select relevant information and bind the different modalities into a more robust percept[15,28,41-43]. (3) Finally, output projections from the sensory cortices reach the higher association cortices for further multisensory integration[44,45].
In addition, bottom-up information in the sensory cortices can be modulated by top-down signals conveying the internal goals or states of the observer[46,47]. Recent research has shown that task engagement and attention to relevant sensory information can also enhance responses in the lower sensory cortices; this enhancement can be mediated by direct projections from higher-level cortical areas to the sensory cortices[48-50].
In summary, multisensory perceptual fusion begins with modality-specific, bottom-up processing of unisensory inputs, which are then integrated in the higher-order association cortex. Meanwhile, multisensory integration in the brain can be modulated by top-down attention. Multisensory processing in the mammalian brain is thus dynamically mediated, resulting in a unique and subjective perceptual experience. Drawing on this crossmodal mechanism from neuroscience, research on multimodal sensing and perception has been carried out, which is reviewed as follows.
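The three processing stages and the top-down modulation described above can be caricatured as a tiny computational model. This is purely an illustrative sketch: the encoding function, coupling strength, and attention gains below are invented for the example and are not fitted to any neural data.

```python
import numpy as np

def unisensory(x, w=2.0):
    """Stage 1: modality-selective encoding in a sensory cortex (toy nonlinearity)."""
    return np.tanh(w * x)

def crossmodal(responses, coupling=0.2):
    """Stage 2: each sensory cortex also receives weak input from the other modalities."""
    responses = np.asarray(responses)
    others = responses.sum() - responses  # summed input from the other modalities
    return responses + coupling * others

def integrate(responses, attention):
    """Stage 3: the association cortex fuses the streams; top-down
    attention re-weights each modality before integration."""
    gains = np.asarray(attention)
    return float(np.dot(gains, responses) / gains.sum())

stimuli = [0.8, 0.3, 0.5]  # e.g., visual, auditory, tactile drive (arbitrary units)
r1 = [unisensory(s) for s in stimuli]
r2 = crossmodal(r1)
percept_attend_visual = integrate(r2, attention=[0.7, 0.15, 0.15])
percept_uniform = integrate(r2, attention=[1 / 3, 1 / 3, 1 / 3])
```

In this toy model, directing attention toward the modality carrying the strongest stimulus yields a larger integrated response than uniform weighting, mirroring the attention-driven response enhancement noted above.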
MULTIMODAL E-SKIN SENSING SYSTEMS
Before the perceptual fusion of e-skins attracted much research attention, considerable effort had been devoted to wearable systems with integrated multimodal sensing, which hold great potential for applications in health monitoring, intelligent prosthetics, HMIs, and humanoid robotics. The biological information acquired by wearable skin sensors is generally categorized into three main types: physical, electrophysiological, and chemical signals[51]. Multimodal sensing e-skin systems can thus be classified
into three modes: (1) Integration of multiple physical sensors; (2) Integration of physical and
electrophysiological sensors; (3) Integration of physical and chemical sensors [Figure 2]. Most of the
multimodal sensing e-skin systems are designed to mimic the functions of human skin, employing physical sensors to detect a variety of physical signals, including normal force, lateral strain, vibration, temperature, and humidity. Beyond this, endowing e-skins with sensing modalities that exceed those of human skin is in great demand. To realize the next generation of ‘‘smart skins’’, chemical sensors, electrophysiological sensors, and additional physical sensors, such as ultraviolet (UV) light sensors, have been integrated into wearable multifunctional e-skin systems[57,59-61]. Recent works on the multimodal sensing of e-skins are reviewed as follows [Table 1].
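A recurring requirement in the works reviewed below is that each sensing unit responds to its target stimulus with negligible response to the others, or that residual cross-sensitivity can be calibrated out. As background, this decoupling can be sketched as a calibration-matrix inversion. The sensitivity matrix below is invented for the illustration and does not describe any specific reported device.

```python
import numpy as np

# Each row: one sensing unit's calibrated sensitivity to
# (pressure, temperature, humidity). Small off-diagonal terms model
# residual cross-sensitivity between channels (hypothetical values).
S = np.array([
    [1.00, 0.05, 0.02],  # pressure unit
    [0.03, 1.00, 0.04],  # temperature unit
    [0.01, 0.06, 1.00],  # humidity unit
])

def decouple(raw_outputs):
    """Recover the underlying stimuli by inverting the sensitivity matrix."""
    return np.linalg.solve(S, raw_outputs)

true_stimuli = np.array([2.0, 25.0, 40.0])  # arbitrary units
raw = S @ true_stimuli                      # what the sensor array reports
recovered = decouple(raw)
```

The same linear-inversion idea underlies many multi-channel calibrations; the decoupling strategies in the specific devices below (e.g., exploiting ion relaxation dynamics) rely on device physics rather than post-hoc matrix algebra.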
The most commonly used approach to multimodal sensing systems is to integrate different in-plane or out-of-plane sensing units[65]. In terms of integrating physical sensors, Ho et al. presented a transparent multimodal e-skin sensor matrix in which only graphene and its derivatives were used as functional materials[54]. Fabricated through a low-cost and facile lamination process, the humidity, thermal, and pressure sensors operated simultaneously, each providing an output corresponding to its specific external stimulus with negligible response to the others. A deformable multimodal ionic receptor was recently presented, the first e-skin system able to differentiate strain and temperature information within a single unit[53]. The
intrinsic features of ion relaxation dynamics (relaxation time and normalized capacitance) were utilized to decouple the two signals, enabling simultaneous monitoring without signal interference. Based on

