A comparative study of IEEE 754 32-bit Float and Posit 32-bit floating point format on precision: Using numerical methods
Posit is a new format for representing floating-point numbers in computers. This thesis investigates the precision of the 32-bit Posit format compared with the current standard, the 32-bit IEEE 754 Float format, by running tests with numerical methods. Posit was chosen due to its promising results in...
Main Authors: | Besseling, Johan; Renström, Anders |
---|---|
Format: | Others |
Language: | English |
Published: | KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020 |
Subjects: | Computer Sciences; Datavetenskap (datalogi) |
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280101 |
id |
ndltd-UPSALLA1-oai-DiVA.org-kth-280101 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-kth-280101; 2020-09-09T05:21:31Z; A comparative study of IEEE 754 32-bit Float and Posit 32-bit floating point format on precision: Using numerical methods; eng; Besseling, Johan; Renström, Anders; KTH, Skolan för elektroteknik och datavetenskap (EECS); 2020; Computer Sciences; Datavetenskap (datalogi); Student thesis; info:eu-repo/semantics/bachelorThesis; text; http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280101; TRITA-EECS-EX ; 2020:346; application/pdf; info:eu-repo/semantics/openAccess |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
Computer Sciences; Datavetenskap (datalogi) |
description |
Posit is a new format for representing floating-point numbers in computers. This thesis investigates the precision of the 32-bit Posit format compared with the current standard, the 32-bit IEEE 754 Float format, by running tests with numerical methods. Posit was chosen because of its promising results in previous work. The numerical methods chosen were the Least Squares Method, the Gauss Newton Interpolation Method, the Trapezoid Method and the Newton-Raphson Method. The test results show that Posit32 achieves at least as high precision as IEEE 754 Float in computations on values of 1 and above, and tends to gain up to three significant figures of precision as the values move towards the range 0 to 1. |
author |
Besseling, Johan; Renström, Anders |
title |
A comparative study of IEEE 754 32-bit Float and Posit 32-bit floating point format on precision: Using numerical methods |
publisher |
KTH, Skolan för elektroteknik och datavetenskap (EECS) |
publishDate |
2020 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-280101 |
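
The comparison described in the abstract can be illustrated with a small Python sketch. It is not the authors' code or test setup. It assumes a 32-bit posit with two exponent bits (es = 2), and the helpers `round_to_float32`, `round_to_posit32` and `trapezoid` are hypothetical names introduced here. Each format is emulated by rounding binary64 intermediate results, which only approximates native arithmetic, and the integrand x² on [0, 1] is an arbitrary choice.

```python
import math
import struct


def round_to_float32(x):
    """Round a Python float (binary64) to the nearest IEEE 754 binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def round_to_posit32(x):
    """Round to the nearest 32-bit posit, assuming es = 2 (useed = 16).

    Sketch only: handles finite values whose magnitude still leaves at
    least one fraction bit; infinities and NaN are not handled.
    """
    if x == 0.0:
        return 0.0
    m, exp = math.frexp(abs(x))           # abs(x) = m * 2**exp, 0.5 <= m < 1
    scale = exp - 1                       # abs(x) = (2m) * 2**scale
    k = scale // 4                        # regime value
    regime_len = k + 2 if k >= 0 else 1 - k
    frac_bits = 32 - 1 - regime_len - 2   # minus sign, regime, exponent bits
    if frac_bits < 1:
        raise ValueError("magnitude outside the range this sketch handles")
    # Integer significand with frac_bits fraction bits; round() breaks ties to even.
    significand = round(math.ldexp(m, frac_bits + 1))
    return math.copysign(math.ldexp(significand, scale - frac_bits), x)


def trapezoid(f, a, b, n, rnd):
    """Composite trapezoid rule, rounding every intermediate result with rnd."""
    h = rnd((b - a) / n)
    total = rnd(rnd(rnd(f(a)) + rnd(f(b))) / 2)
    for i in range(1, n):
        x = rnd(a + rnd(i * h))
        total = rnd(total + rnd(f(x)))
    return rnd(total * h)


def main():
    def f(x):
        return x * x

    # binary64 run with the same n, so the discretization error cancels out
    reference = trapezoid(f, 0.0, 1.0, 1000, lambda v: v)
    for name, rnd in (("float32", round_to_float32), ("posit32", round_to_posit32)):
        error = abs(trapezoid(f, 0.0, 1.0, 1000, rnd) - reference)
        print(f"{name}: absolute rounding error = {error:.3e}")


if __name__ == "__main__":
    main()
```

Under the es = 2 assumption, a posit32 carries 27 fraction bits for magnitudes between 1/16 and 16, against 23 for binary32, and loses fraction bits as magnitudes move away from 1. This sketch makes no claim about reproducing the thesis's measured results.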