Complex system recovery by process programming redundancy

This paper presents recovery in a fault-tolerant control system. We start from a parallel computer system with distributed memory and message-passing communication. The system consists of processor elements, communication lines and switches. At least one application p...


Bibliographic Details
Main Author: Vokorokos Liberios
Format: Article
Language: English
Published: Technical University of Kosice 2000-12-01
Series: Acta Montanistica Slovaca
Subjects: redundancy, recovery, process, locked process, tolerant, fault
Online Access: http://actamont.tuke.sk/pdf/2000/n4/9vokorokos.pdf
id doaj-e019202924ff4740822ffb1d9fb18396
record_format Article
spelling doaj-e019202924ff4740822ffb1d9fb18396; 2020-11-25T00:47:23Z; eng; Technical University of Kosice; Acta Montanistica Slovaca; 1335-1788; 2000-12-01; 5; 4; 383; 386; Complex system recovery by process programming redundancy; Vokorokos Liberios; http://actamont.tuke.sk/pdf/2000/n4/9vokorokos.pdf; redundancy; recovery; process; locked process; tolerant; fault
collection DOAJ
language English
format Article
sources DOAJ
author Vokorokos Liberios
title Complex system recovery by process programming redundancy
publisher Technical University of Kosice
series Acta Montanistica Slovaca
issn 1335-1788
publishDate 2000-12-01
description This paper presents recovery in a fault-tolerant control system. We start from a parallel computer system with distributed memory and message-passing communication. The system consists of processor elements, communication lines and switches. At least one application process runs on each processor of the parallel system. Processes execute in parallel and sequentially, communicating with each other over the communication lines to carry out one task. Several tasks can run on the parallel system. Processes are mapped to the processor elements. Fault tolerance is ensured at the level of processor elements, communication lines, switches and processes, using software and hardware redundancy. The purpose of recovery in a fault-tolerant parallel system is to restore and maintain the system's resistance to faults after a fault has occurred. Resistance to faults is ensured by the applied fault-tolerance method. The paper describes how the system functions after a fault. Faults in different parts of the parallel system differ in importance. Consider a faulty processor, line or switch. The most important is a processor fault. In this case the processes allocated to that processor have to be moved to another processor, recovered and initialised again. We can usually assume that the processor's memory content is lost or inaccessible after the fault. It is necessary to remove and redirect all communication lines going through this process. The process of system recovery is known. The question remains how, and by whom, recovery of the processor kernel is controlled. Control can be either centralised or decentralised. A further question is how many copies of a process are enough for sufficient fault tolerance. With active and passive processes, this depends on the required level of security. One passive copy of a process is sufficient if we assume that faults do not occur on two processors occupied by the same process at the same time, or during recovery of the system.
topic redundancy
recovery
process
locked process
tolerant
fault
url http://actamont.tuke.sk/pdf/2000/n4/9vokorokos.pdf
work_keys_str_mv AT vokorokosliberios complexsystemrecoverybyprocessprogrammingredundancy
_version_ 1725260286132224000
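
The recovery scheme summarised in the description (one active and one passive copy of each process, placed on different processors, with the passive copy taking over after a processor fault) can be illustrated with a minimal Python sketch. This is not the author's implementation: the names `ParallelSystem`, `place` and `fail_processor` are hypothetical, a single centralised controller is assumed (the abstract leaves centralised versus decentralised control open), and the least-loaded choice of a new host for the passive copy is an assumption made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class ParallelSystem:
    """Toy model: every process has one active and one passive copy."""
    processors: Set[str]
    active: Dict[str, str] = field(default_factory=dict)              # process -> processor
    passive: Dict[str, Optional[str]] = field(default_factory=dict)   # process -> processor or None

    def place(self, process: str, active_on: str, passive_on: str) -> None:
        # The two copies must not share a processor, otherwise one fault loses both.
        if active_on == passive_on:
            raise ValueError("active and passive copy must be on different processors")
        if not {active_on, passive_on} <= self.processors:
            raise ValueError("unknown processor")
        self.active[process] = active_on
        self.passive[process] = passive_on

    def _load(self, processor: str) -> int:
        # Number of copies (active or passive) currently hosted on a processor.
        hosted = list(self.active.values()) + list(self.passive.values())
        return hosted.count(processor)

    def _pick_spare(self, exclude: str) -> Optional[str]:
        # Assumption: the new passive copy goes to the least-loaded live processor.
        candidates = sorted(self.processors - {exclude})
        return min(candidates, key=self._load) if candidates else None

    def fail_processor(self, failed: str) -> List[str]:
        """Handle a processor fault; return the processes that were restarted."""
        self.processors.discard(failed)
        restarted = []
        for process in sorted(self.active):
            if self.active[process] == failed:
                backup = self.passive[process]
                if backup is None or backup not in self.processors:
                    raise RuntimeError(f"{process}: both copies lost")
                # Promote the passive copy: it is moved into service and re-initialised.
                self.active[process] = backup
                self.passive[process] = None
                restarted.append(process)
            elif self.passive[process] == failed:
                self.passive[process] = None
            if self.passive[process] is None:
                # Re-create redundancy on another live processor, if one exists.
                self.passive[process] = self._pick_spare(exclude=self.active[process])
        return restarted


if __name__ == "__main__":
    system = ParallelSystem(processors={"P0", "P1", "P2"})
    system.place("a", active_on="P0", passive_on="P1")
    system.place("b", active_on="P1", passive_on="P2")
    print(system.fail_processor("P0"), system.active, system.passive)
```

In this sketch a second fault that hits the only surviving copy of a process before a new passive copy exists raises an error, which corresponds to the assumption stated in the description that one passive copy suffices only if two such faults do not coincide.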