
Page 265                          Liu et al. Intell Robot 2024;4(3):256-75 | http://dx.doi.org/10.20517/ir.2024.17
                                             Figure 4. Flow chart of the parallel solution for S.

               It can be seen from the formulas that the Schur complement S and the reduced right-hand side g are each
               obtained by summing two matrices, and parallel computing with CUDA is very well suited to this kind of
               matrix superposition.
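The per-block independence that makes this superposition GPU-friendly can be sketched as follows. This is an illustrative sketch, not the authors' code: each output entry of S = B − P (with P standing for the product term E C⁻¹ Eᵀ) depends only on its own position, so every loop iteration below corresponds to the work item a single CUDA thread would own.

```python
def superpose(B, P):
    """Return S = B - P, one entry at a time.

    Each iteration is independent of all others -- exactly the
    work a single CUDA thread performs in the parallel version,
    where blockIdx/threadIdx select the (i, j) position.
    """
    m = len(B)
    S = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            S[i][j] = B[i][j] - P[i][j]
    return S

B = [[4.0, 1.0], [1.0, 3.0]]
P = [[1.0, 0.5], [0.5, 1.0]]
print(superpose(B, P))  # -> [[3.0, 0.5], [0.5, 2.0]]
```

Because no entry reads any other entry, the kernel needs no synchronization: the thread count simply equals the number of (non-zero) blocks.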


               For the example of the coefficient matrix S, we copy B to the corresponding positions in S as its initialized
               structure, take the number of non-zero blocks of B as the number of threads, and let each thread perform
               one copy operation. Assuming that the number of camera poses is m and the number of map points is n, S
               can be divided into m × m sub-blocks. The sub-block at position (i, j) is:


                                    S(i, j) = ∑_{x=1}^{n} E(i, x) · C(x)⁻¹ · E(j, x)ᵀ,                    (11)

               where S(i, j) is the sub-block in the i-th row and j-th column of S, and C(x) is the x-th non-zero (diagonal)
               sub-block of C. That is, each sub-block of S needs to accumulate up to n terms. Due to the sparse structure
               of E, the actual number of calculations is much smaller than n. The number of calculations for S(i, j) is
               determined by the i-th and j-th rows of E: if row i and row j of E share k non-zero blocks with identical
               column indexes, there will be exactly k sum operations. The index structure of E does not change between
               iterations, so all non-zero block indexes can be pre-computed during construction of the linear structure.
               Since the calculation of each sub-block of S is independent, and the calculation of each sum term within a
               sub-block is also independent, S is computed in a two-level parallel mode, with coarse-grained parallelization
               among non-zero blocks and fine-grained parallelization within non-zero blocks, further improving the
               operation efficiency. The calculation of the reduced right-hand side g adopts the same strategy. Figure 4
               shows the flow chart for solving S.
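A minimal sketch of Eq. (11) and the sparsity argument above, not the authors' implementation: blocks are reduced to scalars (1×1) for readability, each row of E is stored as a map from column index to value, and the shared non-zero indexes are pre-computed once per structure, so S(i, j) costs exactly k sum operations rather than n. The function name `schur_subblocks` is illustrative.

```python
def schur_subblocks(E_rows, C_diag, m):
    """E_rows[i] maps column index x -> scalar block E(i, x);
    C_diag[x] is the x-th diagonal block of C (here a scalar)."""
    # Pre-compute the shared non-zero column indexes once: the
    # index structure of E does not change between iterations.
    shared = {(i, j): sorted(set(E_rows[i]) & set(E_rows[j]))
              for i in range(m) for j in range(m)}
    S = [[0.0] * m for _ in range(m)]
    # Each (i, j) sub-block is independent (coarse-grained
    # parallelism), and each term of the sum is independent too
    # (fine-grained parallelism); here both run sequentially.
    for (i, j), idxs in shared.items():
        S[i][j] = sum(E_rows[i][x] * (1.0 / C_diag[x]) * E_rows[j][x]
                      for x in idxs)
    return S

# Two poses, three points: pose 0 observes points 0 and 1,
# pose 1 observes points 1 and 2.
E_rows = [{0: 2.0, 1: 1.0}, {1: 3.0, 2: 1.0}]
C_diag = [2.0, 1.0, 4.0]
print(schur_subblocks(E_rows, C_diag, 2))
# -> [[3.0, 3.0], [3.0, 9.25]]
```

Note that the off-diagonal block S(0, 1) touches only the single shared index x = 1 (k = 1), while a dense formulation would sum over all n = 3 points.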


               In order to make full use of the limited memory of the graphics processor, the Compressed Sparse Row
               (CSR) format is adopted to store the sparse matrix S. Compared with dense matrix storage, the CSR format
               avoids wasting space on zero entries. Finally, the incremental equation is solved with the cuSolver linear
               solver provided by CUDA.
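The CSR layout can be sketched in a few lines. This is a generic illustration of the format, not code from the paper: a matrix with z non-zeros is stored as z values, z column indexes, and m + 1 row pointers, instead of m·m dense entries; arrays of this shape are what cuSolver's sparse routines consume.

```python
def to_csr(dense):
    """Convert a dense row-major matrix to CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, v in enumerate(row):
            if v != 0.0:          # store only non-zero entries
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))  # end of this row in `values`
    return values, col_idx, row_ptr

S = [[3.0, 0.0, 0.5],
     [0.0, 2.0, 0.0],
     [0.5, 0.0, 4.0]]
print(to_csr(S))
# -> ([3.0, 0.5, 2.0, 0.5, 4.0], [0, 2, 1, 0, 2], [0, 2, 3, 5])
```

Here 9 dense entries shrink to 5 values plus index arrays; for the large, very sparse S of bundle adjustment the saving is far greater.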