Introduction to Distributed Computing Systems
Lecture 1
Definition
- Distributed System
- Tannenbaum
- A distributed system is a collection of independent computers that appears to its users as a single coherent system
- Lamport
- You know you have one when the crash of a computer you've never heard of stops you from getting any work done.
- Applications that consist of a set of processes that are distributed across a network of machines and work together as an ensemble to solve a common problem.
- Types of Distributed Systems
- Distributed computing systems
- Cluster computing: homogeneous
- Grid computing: heterogenous via virtual organizations
- Cloud computing: everything as a service
- Distributed Information Systems
- Transaction processing system (Transaction RPC)
- Enterprise Information integration
- Publish/Subscribe systems (message oriented v.s. RPC/RMI)
- Distributed Pervasive Systems
- Home systems
- Healthy care systems
- Sensor networks
Characteristic properties of transactions (ACID)
- Atomic
- To the outside world, the transaction happends individually
- Consistent
- The transaction does not violate system invariants
- Isolated
- Concurrent transactions do not interfere with each other.
- Durable
- Once a transaction commits, the changes are permanent.
Goal/Beneftis
- Resource sharing
- Distribution transparency
- Scalability
- Fault tolerance and availability
- Performance
- Parallel computing can be considered a subset of distributed computing.
Challenges
- Heterogeneity
- Need for "openness"
- Open standards: key interfaces in software and communication protocols need to be standardized
- Often defined in Interface Definition Language (IDL)
- Security
- Denial of service attacks
- Scalability
- size
- geographically
- administratively
- Transparency
- Failure handling
- Quality of service
Scalability
- Factors
- Size
- Concerning centralized services/data/algorithm
- Geographically
- Synchronous communication in LAN vs. asynchronous communication in WAN
- Administratively
- Policy conflicts from different organizations (e.g., for security, access control)
- Scalability techniques
- Hiding communication latency
- Asynchronous communication
- Code migration (to client)
- Distribution
- Splitting a large component to parts (e.g., DNS)
- Replication
- Caching (decision of clients vs. servicers)
- On demand (pull) vs. planned (push)
- Interprocess communication
- Socket programming, message passing, etc.
- Remote invocation
- Indirect communication
- Group communication
- Publisher-subscriber
- Message queues
- Tuple spaces
- Client-servier
- Group-oriented/Peer-to-Peer
- Applications that require reliability, scalability
Distributed Software
- Middleware handles heterogeneity
- High-level support
- Make distributed nature of application transparent to the user/programmmer
- Remote procedure callls
- RPC + Object orientation = CORBA
- Higher-level support BUT expose remote objects, partial failure, etc. to the programmer
- Scalability
Fundamental/Abstract Models
- Interaction Model
- Reflects the assumptions about the progresses and the communication channels in the distributed system.
- Failure Model
- Distinguish between the types of failures of the processes and the communication channels.
- Security Model
- Assumptions about the principals and the adversary
Interaction Models
- Synchronous Distributed Systems
- A system in which the following bounds are defined
- The time to execute each step of a process has an upper and lower bound
- Each message transmitted over a channel is received within a known bounded delay.
- Each process has a local clock whose drift rate from real times has a known bound.
- Asynchronous distributed system
- Each step of a process can take an arbitrary time
- Message delivery time is arbitrary
- Clock drift rate are arbirary
- Some implications
- In a synchronous system, timeout can be used to detect failures
- While in asynchrous system, it is impossible to detect failures or "reach aggrement".
No comments:
Post a Comment