Using graphics processor on B-Spline finite element analysis

Bibliographic Details
Main Authors: Wu-Yung Chen, 陳武勇
Other Authors: Shi-Pin Ho
Format: Others
Language: zh-TW
Published: 2007
Online Access: http://ndltd.ncl.edu.tw/handle/58206200551958860054
Description
Summary: Master's === National Cheng Kung University === Department of Mechanical Engineering (Master's and Doctoral Program) === 95 === This thesis discusses how to use the graphics processor for B-Spline finite element analysis and evaluates its performance. It studies the basic linear algebra operations on the graphics processor: vector inner product, vector-vector addition and multiplication, full matrix-vector multiplication, full matrix-matrix multiplication, and sparse matrix-vector multiplication. Performance is improved by varying the thread configuration and is compared with the CUBLAS (Basic Linear Algebra Subprograms) functions. In addition, the nonzero elements of the sparse matrix are sorted to speed up sparse matrix-vector multiplication. Finally, a B-Spline finite element problem is solved iteratively with the Jacobi-preconditioned conjugate gradient method. For this finite element problem, the single-precision floating-point results from the graphics processor do not converge exactly to the results from the central processing unit, which uses double-precision floating-point operations; the errors are about 0.04% to 2%. The graphics processor used is the GeForce 8800 GTX, which Nvidia introduced at the beginning of this year. The GeForce 8800 GTX includes 128 stream processors and operates on a SIMD (Single Instruction, Multiple Data) parallel model. Its peak performance reaches about 350 GFLOPS, and its data transfer bandwidth is 86.5 GB/s. The GeForce 8800 GTX has a new architecture called CUDA (Compute Unified Device Architecture), which integrates the computing capability of the stream processors and exposes it independently. Compared with traditional graphics processors, the GeForce 8800 GTX imposes fewer restrictions on the use of its 128 stream processors.
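
The summary names two building blocks, sparse matrix-vector multiplication and the Jacobi-preconditioned conjugate gradient method. The following is a minimal CPU-only sketch of how they fit together, not the thesis's GPU implementation. It assumes the sparse matrix is stored in compressed sparse row (CSR) form, which the record does not specify, and it runs in Python's double precision. All function names and the small tridiagonal test matrix are hypothetical, chosen only for illustration.

```python
# Illustrative CPU-only sketch; names and storage format (CSR) are assumptions.

def csr_matvec(values, col_idx, row_ptr, x):
    """Return y = A x for a matrix A stored in CSR form."""
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        s = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            s += values[k] * x[col_idx[k]]
        y[i] = s
    return y


def csr_diagonal(values, col_idx, row_ptr):
    """Extract the main diagonal of a CSR matrix."""
    n = len(row_ptr) - 1
    diag = [0.0] * n
    for i in range(n):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            if col_idx[k] == i:
                diag[i] = values[k]
    return diag


def dot(u, v):
    """Vector inner product."""
    return sum(a * b for a, b in zip(u, v))


def jacobi_pcg(values, col_idx, row_ptr, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive definite A.

    The Jacobi preconditioner is M = diag(A), so applying M^-1
    is an element-wise division by the diagonal.
    Returns (x, number_of_iterations).
    """
    n = len(b)
    inv_diag = [1.0 / d for d in csr_diagonal(values, col_idx, row_ptr)]
    x = [0.0] * n
    r = list(b)                                   # r = b - A*0
    z = [inv_diag[i] * r[i] for i in range(n)]    # z = M^-1 r
    p = list(z)
    rz = dot(r, z)
    b_norm = dot(b, b) ** 0.5 or 1.0
    for it in range(max_iter):
        if dot(r, r) ** 0.5 <= tol * b_norm:
            return x, it
        Ap = csr_matvec(values, col_idx, row_ptr, p)
        alpha = rz / dot(p, Ap)
        x = [x[i] + alpha * p[i] for i in range(n)]
        r = [r[i] - alpha * Ap[i] for i in range(n)]
        z = [inv_diag[i] * r[i] for i in range(n)]
        rz_new = dot(r, z)
        p = [z[i] + (rz_new / rz) * p[i] for i in range(n)]
        rz = rz_new
    return x, max_iter


def build_tridiagonal_csr(n):
    """Build the n x n matrix with 2 on the diagonal and -1 beside it."""
    values, col_idx, row_ptr = [], [], [0]
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                values.append(v)
                col_idx.append(j)
        row_ptr.append(len(values))
    return values, col_idx, row_ptr


if __name__ == "__main__":
    vals, cols, ptr = build_tridiagonal_csr(4)
    rhs = csr_matvec(vals, cols, ptr, [1.0, 2.0, 3.0, 4.0])
    solution, iterations = jacobi_pcg(vals, cols, ptr, rhs)
    print(solution, iterations)
```

Each iteration is dominated by one sparse matrix-vector product and a few inner products and vector updates, which are the same basic operations the summary lists as being studied on the graphics processor.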