In the cluster calculation, a double parallelization is made over two loops: spin multiplicity and eigenstates, where the spin multiplicity is one for spin-unpolarized and non-collinear calculations, and two for spin-polarized calculations. Parallelization is applied first over the spin multiplicity and then over the eigenstates. OpenMX Ver. 3.8 employs ELPA [28], a highly parallelized eigenvalue solver, to solve the eigenvalue problem in the cluster calculation. Figure 21 (b) shows the speed-up ratio of the elapsed time as a function of the number of processes for a spin-polarized calculation of a single-molecule magnet consisting of 148 atoms. The input file 'Mn12.dat' is found in the directory 'work'. The speed-up ratio is 11 and 17 using 32 and 64 processes, respectively.
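To put the reported speed-up ratios in perspective, the short Python sketch below converts them into parallel efficiencies (speed-up divided by the number of processes). It is only an illustration and is not part of OpenMX. The function name `parallel_efficiency` is hypothetical, and the sketch assumes that the speed-up ratio is measured relative to a single-process run, which the text above does not state explicitly.

```python
def parallel_efficiency(speedup, n_procs):
    """Return the parallel efficiency, i.e. speed-up divided by process count."""
    if n_procs <= 0:
        raise ValueError("n_procs must be a positive integer")
    return speedup / n_procs


# Speed-up ratios quoted above for 'Mn12.dat' (number of processes -> ratio).
# Assumption: the ratios are relative to a run on one process.
reported = {32: 11.0, 64: 17.0}

efficiencies = {n: parallel_efficiency(s, n) for n, s in reported.items()}

for n, eff in efficiencies.items():
    print(f"{n} processes: speed-up {reported[n]:.0f}, efficiency {eff:.1%}")
```

Under this assumption the efficiency is roughly 34% with 32 processes and roughly 27% with 64 processes, so doubling the number of processes for this 148-atom system still shortens the elapsed time, but with diminishing returns.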