Summary: | Ocean numerical models are steadily moving toward multi-physics processes and higher resolution, driven by the growing volume of measured ocean data and deeper research in ocean science. General-purpose computing capability can no longer meet the needs of these models, so more powerful hardware and parallel software are required to run ocean numerical model programs. China has made great progress in the research and development of homegrown high-performance processors, of which the Sunway SW26010 many-core processor is the most outstanding representative. This paper addresses the lag in ocean numerical model software adapted to homegrown processors and presents, for the first time, a parallel implementation and optimization of the Regional Ocean Modeling System (ROMS) on the Sunway SW26010 many-core processor. Three programming methods are used: OpenACC*, athread with Fortran, and athread with C. These methods are compared in terms of programming approach, workload, and execution efficiency, which offers practical guidance for programmers who use Sunway SW26010 many-core processors. The evaluation measures the execution times and speedups of the model kernel and of the whole ROMS under different optimizations, input datasets, and numbers of computing processing elements (CPEs). The results show that, compared with the original ROMS, the optimized hotspot program achieves a speedup of up to 3.69x.
|