Dependability

The courses below focus on the design of reliable and fault-tolerant systems.

Courses

CS 425/ECE 428: Distributed Systems

Covers topics needed for a basic understanding of distributed computer systems: protocols, specification techniques, global states and their determination, reliable broadcast, transactions and commitment, security, and real-time systems. Prerequisite is CS 241 or ECE 391. Available Fall 2012.

CS 536/ECE 542: Design of Fault-Tolerant Digital Systems

This course introduces a system-level (hardware and software) view of design issues in reliable computing. The material covers a broad spectrum of hardware and software error detection and recovery techniques. The lectures discuss how these techniques interact; e.g., which techniques can be provided in the hardware, operating system, and network communication layers, and which can be provided via a distributed software layer or in the application itself. Prerequisite is ECE 411.