Complex system recovery by process programming redundancy
This paper presents the recovery of a fault-tolerant control system. We start from a parallel computer system with distributed memory and message-passing communication. The system consists of processor elements, communication lines and switches. At least one application p...
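
The recovery step summarised in this record's description field (processes of a failed processor are moved to another processor and restarted, with one passive copy per process held elsewhere) can be illustrated with a minimal sketch. It is not the paper's implementation: the names `Placement`, `Cluster`, `fail` and `_spare` are hypothetical, control is assumed to be centralised, and the spare processor is chosen by a simple least-loaded rule that the record does not specify.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Placement:
    active: str   # processor element running the process
    passive: str  # processor element holding the passive (standby) copy


@dataclass
class Cluster:
    processors: set[str]
    placements: dict[str, Placement] = field(default_factory=dict)

    def _load(self, processor: str) -> int:
        # Number of active or passive copies currently hosted on a processor.
        return sum(processor in (p.active, p.passive)
                   for p in self.placements.values())

    def _spare(self, exclude: set[str]) -> str:
        # Assumed policy: least-loaded healthy processor, ties broken by name.
        candidates = sorted(self.processors - exclude)
        if not candidates:
            raise RuntimeError("no healthy processor left for a copy")
        return min(candidates, key=lambda p: (self._load(p), p))

    def fail(self, processor: str) -> list[str]:
        """Handle a processor fault; return the processes that were restarted."""
        self.processors.discard(processor)
        restarted = []
        for name, place in sorted(self.placements.items()):
            if place.active == processor:
                # Memory of the failed processor is assumed lost:
                # promote the passive copy, then create a new passive copy.
                place.active = place.passive
                place.passive = self._spare({place.active})
                restarted.append(name)
            elif place.passive == processor:
                # Only the standby was lost: re-create it elsewhere.
                place.passive = self._spare({place.active})
        return restarted
```

The sketch relies on the same assumption the abstract states: a second fault does not hit the processor holding the passive copy before recovery completes. If it did, `fail` would promote a copy that no longer exists, which is why more than one passive copy may be needed for stricter requirements.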
Main Author: | Vokorokos Liberios |
---|---|
Format: | Article |
Language: | English |
Published: | Technical University of Kosice, 2000-12-01 |
Series: | Acta Montanistica Slovaca |
Subjects: | |
Online Access: | http://actamont.tuke.sk/pdf/2000/n4/9vokorokos.pdf |
id |
doaj-e019202924ff4740822ffb1d9fb18396 |
---|---|
record_format |
Article |
spelling |
doaj-e019202924ff4740822ffb1d9fb18396; 2020-11-25T00:47:23Z; eng; Technical University of Kosice; Acta Montanistica Slovaca; 1335-1788; 2000-12-01; 5; 4; 383; 386; Complex system recovery by process programming redundancy; Vokorokos Liberios; http://actamont.tuke.sk/pdf/2000/n4/9vokorokos.pdf; redundancy recovery process locked process tolerant fault |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Vokorokos Liberios |
title |
Complex system recovery by process programming redundancy |
publisher |
Technical University of Kosice |
series |
Acta Montanistica Slovaca |
issn |
1335-1788 |
publishDate |
2000-12-01 |
description |
This paper presents the recovery of a fault-tolerant control system. We start from a parallel computer system with distributed memory and message-passing communication. The system consists of processor elements, communication lines and switches. At least one application process runs on each processor of the parallel system. Processes execute in parallel and sequentially, communicating with each other over the communication lines while carrying out one task. Several tasks can run on the parallel system. Processes are mapped to the processor elements. Fault tolerance is ensured at the level of processor elements, communication lines, switches and processes by means of software and hardware redundancy. The purpose of recovery in a fault-tolerant parallel system is to restore and maintain the system's tolerance to faults after a fault has occurred. Resistance to faults is ensured by the applied fault-tolerance method. The paper describes how the system functions after a fault. Faults in different parts of the parallel system differ in importance. Consider a faulty processor, line or switch. The most important is a processor fault. In this case the processes allocated to that processor have to be moved to another processor, recovered and initialised again. We can usually assume that the content of the processor's memory is lost, or inaccessible, after the fault. It is also necessary to remove and redirect all communication lines going through this process. The process of system recovery is known. The open question is how, and by whom, recovery of the processor kernel is controlled. Control can be either centralised or decentralised. A further question is how many copies of a process are enough for sufficient resistance to faults. For active and passive processes, this depends on the required level of security. One passive copy of a process is sufficient if we assume that a fault does not occur, simultaneously or during system recovery, on two processors occupied by the same process. |
topic |
redundancy recovery process locked process tolerant fault |
url |
http://actamont.tuke.sk/pdf/2000/n4/9vokorokos.pdf |