Tuesday, December 23, 2014

Introduction to Memory

1. Background

  • Program must be brought (from disk) into memory and placed within a process for it to be run
  • Main memory and registers are the only storage the CPU can access directly
  • Memory unit only sees a stream of addresses + read requests
    • or address +data and write request
  • Register access in one CPU clock (or less)
  • Main memory can take many cycles (stalls)
  • Caches sit between main memory and CPU registers
  • Protection of memory required to ensure correct operation 
2. Address binding
  • Addresses represented in different ways at different stages of a program's life
    • source code addresses usually symbolic
    • compiled code addresses bind to relocatable addresses
      • i.e., 14 bytes from beginning of this module
    • Linker or loader will bind relocatable addresses to absolute addresses
    • Each binding maps one address space to another
  • Address binding of instructions and data to memory addresses can happen at three different stages
    • compile time: if memory location known a priori, absolute code can be generated; must recompile code if starting location changes
    • load time: must generate relocatable code if memory location is not known at compile time
    • execution time: binding delayed until run time if the process can be moved during its execution from one memory segment to another
      • need hardware support for address maps (e.g., base and limit registers)
3. Multistep Processing of a user program



4. Logical vs. Physical Address Space
  • The concept of a logical address space that is bound to a separate physical address space is central to proper memory management
    • logical address: generated by the CPU; also referred to as virtual address
    • physical address: address seen by the memory unit
  • Logical and physical addresses are the same in compile-time and load-time address-binding schemes; logical (virtual) and physical addresses differ in execution-time address-binding scheme.
  • Logical address space is the set of all logical addresses generated by a program
  • Physical address space is the set of all physical addresses corresponding to these logical addresses
5. Dynamic loading
  • routine is not loaded until it is called
  • better memory-space utilization; unused routine is never loaded
  • all routines are kept on disk in relocatable load format
  • useful when large amounts of code are needed to handle infrequently occurring cases
  • no special support from the operating system is required
    • implemented through program design
    • OS can help by providing libraries to implement dynamic loading
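  • For example, on POSIX systems a routine can be loaded on demand with dlopen/dlsym; a minimal sketch (the library name libmath_extra.so and the routine heavy_compute are hypothetical):

#include <dlfcn.h>   /* POSIX dynamic-loading interface */
#include <stdio.h>

int main(void) {
    /* Load the library only when (if) this code path is actually reached */
    void *handle = dlopen("libmath_extra.so", RTLD_LAZY);   /* hypothetical library */
    if (!handle) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    /* Look up the routine by name (usual dlsym cast idiom) */
    double (*heavy_compute)(double) = (double (*)(double)) dlsym(handle, "heavy_compute");
    if (heavy_compute)
        printf("%f\n", heavy_compute(3.0));

    dlclose(handle);   /* unload when no longer needed */
    return 0;
}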
6. Dynamic Linking
  • Static linking
    • system libraries and program code combined by the loader into the binary program image
  • Dynamic linking
    • linking postponed until execution time
      • small piece of code, the stub, used to locate the appropriate memory-resident library routine
  • Stub replaces itself with the address of the routine, and executes the routine
  • Operating system checks if the routine is in the process's memory address space
    • if not in address space, add to address space
  • Dynamic linking is particularly useful for libraries
    • also known as shared libraries
  • Consider applicability to patching system libraries
    • versioning may be needed
7. Base and Limit Registers
  • A pair of base and limit registers define the logical address space


8. Hardware address protection with base and limit registers


9. Memory Management Unit (MMU)
  • Hardware device that at run time maps virtual addresses to physical addresses
  • The user program deals with logical addresses; it never sees the real physical addresses
    • execution-time binding occurs when reference is made to location in memory
    • logical address bound to physical address


10. Dynamic relocation using a relocation register
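  • The original figure is not reproduced here; below is a minimal sketch (illustrative, not any particular hardware) of the check-and-relocate step performed on every CPU-generated address, using a relocation (base) register and a limit register:

#include <stdint.h>
#include <stdio.h>

/* Hypothetical register values, for illustration only */
static uint32_t relocation_reg = 14000;  /* base: start of the process in physical memory */
static uint32_t limit_reg      = 3000;   /* size of the process's logical address space */

/* Every CPU-generated (logical) address goes through this check */
int translate(uint32_t logical, uint32_t *physical) {
    if (logical >= limit_reg)
        return -1;                        /* trap to the OS: addressing error */
    *physical = logical + relocation_reg; /* dynamic relocation at execution time */
    return 0;
}

int main(void) {
    uint32_t phys;
    if (translate(346, &phys) == 0)
        printf("logical 346 -> physical %u\n", phys);   /* prints 14346 */
    return 0;
}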


Job Scheduling

1. Objective

  • maximum CPU utilization obtained with multiprogramming
  • CPU-I/O Burst Cycle: process execution consists of a cycle of CPU execution and I/O wait
2. CPU Scheduler
  • Selects from among the processes in ready queue, and allocates the CPU to one of them
    • queues may be ordered in various ways
  • CPU scheduling decisions may take place when a process
    1. switches from running to waiting state
    2. switches from running to ready state
    3. switches from waiting to ready state
    4. terminates
  • scheduling under circumstances 1 and 4 is non-preemptive
  • all other scheduling is preemptive
    • consider access to shared data
    • consider preemption while in kernel mode
    • consider interrupts occurring during crucial OS activities
3. Dispatcher
  • Dispatcher module gives control of the CPU to the process selected by the short-term scheduler; this involves
    • switching context
    • switching to user mode
    • jumping to the proper location in the user program to restart that program
  • Dispatch latency
    • time it takes for the dispatcher to stop one process and start another running
4. Scheduling Criteria
  • CPU utilization
    • keep the CPU as busy as possible
  • Throughput 
    • # of processes that complete their execution per time unit
  • Turnaround time
    • amount of time to execute a particular process
  • Waiting time
    • amount of time a process has been waiting in the ready queue


5. First-Come, First-Served (FCFS) Scheduling



6. Shortest-Job-First (SJF) Scheduling
  • associate with each process the length of its next CPU burst
    • use these lengths to schedule the process with the shortest time
  • SJF is optimal
    • gives minimum average waiting time for a given set of processes
      • the difficulty is knowing the length of the next CPU request
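  • Worked illustration (burst times are hypothetical, not from the slides): with processes P1, P2, P3, P4 and next CPU bursts 6, 8, 7, 3, SJF runs them in the order P4, P1, P3, P2, giving waiting times 0, 3, 9, 16 and an average of (0 + 3 + 9 + 16)/4 = 7; the FCFS order P1, P2, P3, P4 would give waiting times 0, 6, 14, 21, an average of 10.25.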


7. Determining the length of the next CPU burst
  • can only estimate the length
    • should be similar to the previous one
    • then pick process with shortest predicted next CPU burst
  • can be done by using the length of previous CPU bursts, using exponential averaging
    • $t_n$ = actual length of the n-th CPU burst
    • $\tau_{n+1}$ = predicted value of the next CPU burst
    • $\alpha$, where $0 \leq \alpha \leq 1$
    • define: $\tau_{n+1} = \alpha t_n + (1-\alpha) \tau_n$
  • commonly, $\alpha$ set to 1/2
  • preemptive version called shortest-remaining-time-first
8. Example of Exponential Averaging
  • $\alpha = 0$
    • $\tau_{n+1} = \tau_n$
    • recent history does not count
  • $\alpha = 1$
    • $\tau_{n+1} = t_n$
    • only the actual last CPU burst counts
  • If we expand the formula, we get
    • $\tau_{n+1} = \alpha t_n + (1-\alpha)\alpha t_{n-1} + \dots + (1-\alpha)^j \alpha t_{n-j} + \dots + (1-\alpha)^{n+1} \tau_0$
  • since both $\alpha$ and $1-\alpha$ are less than or equal to 1, each successive term has less weight than its predecessor
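  • A minimal C sketch of this estimator (the names are illustrative, not from the slides):

/* Exponential averaging of CPU-burst lengths: tau_{n+1} = alpha*t_n + (1-alpha)*tau_n */
double predict_next_burst(double alpha, double t_n, double tau_n) {
    return alpha * t_n + (1.0 - alpha) * tau_n;
}

/* Example: with alpha = 0.5, initial guess tau_0 = 10 and an observed burst t_0 = 6,
   the next prediction is 0.5*6 + 0.5*10 = 8. */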
9. Example of Shortest Remaining-time-first


10. Priority Scheduling
  • A priority number (integer) is associated with each process
  • The CPU is allocated to the process with the highest priority (smallest integer = highest priority)
    • preemptive
    • non-preemptive
  • SJF is priority scheduling where priority is the inverse of predicted next CPU burst time
  • Problem = starvation 
    • low priority processes may never execute
  • Solution = aging
    • as time progresses, increase the priority of the process


11. Round Robin
  • Each process gets a small unit of CPU time (time quantum q), usually 10-100 milliseconds.
    • after this time has elapsed, the process is preempted and added to the end of the ready queue
  • If there are n processes in the ready queue and the time quantum is q, then each process gets 1/n of the CPU time in chunks of at most q time units at once. 
    • No process waits more than (n-1)q time units.
  • A timer interrupts every quantum to schedule the next process
  • Performance
    • q large: behaves like FIFO
    • q small: overhead is too high (q must be large with respect to the context-switch time)
  • Example of Round robin with time Quantum =4 
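    • The original chart is not preserved here; a typical worked example with q = 4 and assumed bursts P1 = 24, P2 = 3, P3 = 3: the schedule is P1 (0-4), P2 (4-7), P3 (7-10), then P1 runs 10-30; waiting times are P1 = 6, P2 = 4, P3 = 7, for an average of 17/3 ≈ 5.67 time units.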





12. Multilevel Queue
  • Ready queue is partitioned into separate queues, e.g.,
    • foreground (interactive)
    • background (batch)
  • Processes remain permanently in a given queue
  • Each queue has its own scheduling algorithm
    • foreground: RR
    • background: FCFS
  • Scheduling must be done between the queues
    • fixed priority scheduling: e.g., serve all from foreground, then from background
      • possibility of starvation
    • time slicing: each queue gets a certain amount of CPU time which it can schedule amongst its processes, e.g., 80% to foreground in RR and 20% to background in FCFS


13. Multi-level feedback queue
  • a process can move between the various queues;
  • aging can be implemented this way
  • multi-level-feedback-queue scheduler defined by the following parameters
    • number of queues
    • scheduling algorithms for each queue
    • method used to determine when to upgrade a process
    • method used to determine when to demote a process
    • method used to determine which queue a process will enter when that process needs service
  • Example of multilevel feedback queue
    • three queues
      • $Q_0$: time quantum 8 milliseconds
      • $Q_1$: time quantum 16 milliseconds
      • $Q_2$: FCFS
    • scheduling
      • a new job enters queue $Q_0$, which is served FCFS
        • when it gains the CPU, the job receives 8 milliseconds
        • if it does not finish in 8 milliseconds, the job is moved to queue $Q_1$
      • at $Q_1$ the job is again served FCFS and receives 16 additional milliseconds
        • if it still does not complete, it is preempted and moved to queue $Q_2$

14. Thread Scheduling
  • Distinction between user-level and kernel-level threads
  • when threads are supported, threads are scheduled, not processes
  • in many-to-one and many-to-many models, the thread library schedules user-level threads to run on an LWP
    • known as process-contention scope (PCS) since scheduling competition is within the process
    • kernel threads scheduled onto an available CPU use system-contention scope (SCS)
      • competition among all threads in the system
15. Multi-Processor Scheduling
  • CPU scheduling more complex when multiple CPUs are available
  • Homogeneous processors within a multiprocessor
  • Asymmetric multiprocessing
    • only one processor accesses the system data structures, alleviating the need for data sharing
  • Symmetric multiprocessing (SMP)
    • each processor is self-scheduling, all processes in common ready queue
    • or each has its own private queue of ready processes
  • Processor Affinity
    • process has affinity for processor on which it is currently running
      • soft affinity
      • hard affinity
      • variation including processor sets
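  • On Linux, hard affinity can be requested explicitly; a minimal sketch using the sched_setaffinity call (error handling omitted, illustrative only):

#define _GNU_SOURCE
#include <sched.h>

int main(void) {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    CPU_SET(0, &mask);                        /* allow this process to run only on CPU 0 */

    sched_setaffinity(0, sizeof(mask), &mask); /* pid 0 means "the calling process" */

    /* ... the process now has hard affinity for CPU 0 ... */
    return 0;
}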
16. Multicore Processors
  • recent trend to place multiple processor cores on the same physical chip
  • faster and consumes less power 
  • multiple threads per core also growing
    • takes advantage of memory stalls to make progress on another thread while memory retrieval happens


Monday, December 22, 2014

Monitor

1. Monitor

  • high-level synchronization construct that allows the safe-sharing of an abstract data type among concurrent processes
monitor monitor-name{
    shared variable declarations
    procedure body p1(){}
    procedure body p2() {}
    procedure body pn() {}
    {
        initialization code
    }

}

2. Schematic View of a Monitor

  • the monitor construct ensures that at most one process can be active within the monitor at a given time
  • shared data (local variables) of the monitor can be accessed only by the local procedures




3. Monitors Introduction

  • to allow a process to wait within the monitor
    • a condition variable must be declared, e.g., condition x, y;
  • condition variables can only be used with the operations wait and signal
    • the operations
      • x.wait() means that the process invoking this operation is suspended until another process invokes x.signal()
      • x.signal() operation resumes exactly one suspended process on condition variable x. If no process is suspended on condition variable x, then the signal operation has no effect.
      • wait and signal operations of the monitors are not the same as semaphore wait and signal operations
4. Monitor with condition variables
  • when a process P signal to wake up process Q that was waiting on a condition, potentially both of them can be active.
  • however, monitor rules require that at most one process can be active within the monitor
    • signal and wait: P waits until Q leaves the monitor
    • signal and continue: Q waits until P leaves the monitor (or, until P waits for another condition)
    • Signal and leave: P has to leave the monitor after signaling
  • the design decision is different for different programming languages
5. Producer-Consumer Problem With Monitor
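  • The original code for this section is not preserved; below is a standard textbook-style monitor sketch of the bounded buffer, in the monitor notation used above (N, insert and remove are illustrative names):

monitor ProducerConsumer {
    item buffer[N];
    int count = 0, in = 0, out = 0;
    condition notFull, notEmpty;

    procedure insert(item x) {
        if (count == N) notFull.wait();    // buffer full: wait for a free slot
        buffer[in] = x;
        in = (in + 1) % N;
        count++;
        notEmpty.signal();                 // wake a waiting consumer, if any
    }

    procedure remove(item *x) {
        if (count == 0) notEmpty.wait();   // buffer empty: wait for an item
        *x = buffer[out];
        out = (out + 1) % N;
        count--;
        notFull.signal();                  // wake a waiting producer, if any
    }
}

  • Under signal-and-continue semantics the if tests would normally be while loops (cf. the design choices in item 4).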



6. Dining-Philosophers Problem with Monitors



Process Synchronization

1. Concurrent access to shared data

  • Example
    • suppose that two processes A and B have access to a shared variable "Balance"
      • process A: Balance = Balance - 100
      • process B: Balance = Balance - 200
    • Further, assume that Process A and Process B are executing concurrently in a time-shared, multiprogrammed system
    • The statement "balance = balance - 100" is implemented by several machine level instructions such as
      • A1. LOAD R1, BALANCE //load BALANCE from memory to register 1 (R1)
      • A2. SUB R1, 100 //subtract 100 from R1
      • A3. STORE BALANCE, R1 //store R1's contents back to the memory location of BALANCE
    • Similarly, "BALANCE = BALANCE - 200" can be implemented in the following
      • B1. LOAD R1, BALANCE
      • B2. SUB R1, 200
      • B3. STORE BALANCE, R1
2. Race Condition
  • Observe: in a time-shared system, the exact instruction execution order cannot be predicted
  • Situations like this, where multiple processes are writing or reading some shared data and the final result depends on who runs precisely when, are called race conditions.
  • A serious problem for any concurrent system using shared variables.
  • We must make sure that some high-level code sections are executed atomically.
  • Atomic operation means that it completes in its entirety without interruption by another potentially conflicting process.
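  • For instance (starting from BALANCE = 1000), the interleaving A1, A2, B1, B2, B3, A3 lets A compute 900 in its register, lets B load 1000, compute 800 and store it, and then lets A store 900: the final BALANCE is 900, B's update is lost, and the intended result of 700 is never produced.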
3. The Critical-Section Problem
  • n processes are competing to use some shared data
  • Each process has a code segment, called critical section (critical region), in which the shared data is accessed.
  • Problem: ensure that when one process is executing in its critical section, no other process is allowed to execute in that critical section.
  • The execution of the critical sections by the processes must be mutually exclusive in time
4. Solving Critical-Section Problem
Any solution to the problem must satisfy four conditions
  • Mutual Exclusion
    • no two processes may be simultaneously inside the same critical section
  • Bounded Waiting
    • no process should have to wait forever to enter a critical section
  • Progress
    • no process executing a code segment unrelated to a given critical section can block another process trying to enter the same critical section
  • Arbitrary speed
    • no assumption can be made about the relative speed of different processes (though all processes have a non-zero speed)
5. General Structure of a Typical Process

do{
...
entry section
     critical section
exit section
     remainder section
} while(1);

  • we assume this structure when evaluating possible solutions to Critical Section Problem
  • in the entry section, the process requests "permission"
  • we consider single-processor systems

5. Getting help from hardware
  • one solution supported by hardware may be to use interrupt capability
do{
     DISABLE INTERRUPTS
          critical section;
     ENABLE INTERRUPTS
         remainder section

} while(1);


6. Synchronization Hardware
  • Many machines provide special hardware instructions that help to achieve mutual exclusion
  • The TestAndSet (TAS) instruction tests and modifies the content of a memory word atomically
  • TAS R1, LOCK
    • reads the contents of the memory word LOCK into register R1
    • and stores a nonzero value (e.g., 1) at the memory word LOCK (again, atomically)
      • assume LOCK = 0;
      • calling TAS R1, LOCK will set R1 to 0, and set LOCK to 1
      • assume LOCK = 1;
      • calling TAS R1, LOCK will set R1 to 1, and set LOCK to 1
7. Mutual Exclusion with Test-and-Set
  • Initially, shared memory word LOCK = 0;
  • Process pi
do{
     entry_section:
      TAS R1, LOCK
      CMP R1, #0
      JNE entry_section //if not equal, jump to entry
         critical section
      MOVE LOCK, #0 //exit section
         remainder section
} while(1);


8. Busy waiting and spin locks
  • This approach is based on busy waiting: if the critical section is being used, waiting processes loop continuously at the entry point
  • a binary lock variable that uses busy waiting is called "spin lock"
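  • A minimal spin-lock sketch using C11 atomics, where atomic_flag_test_and_set plays the role of the TAS instruction above (an illustration, not the course's required solution):

#include <stdatomic.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;   /* shared: clear means free */

void acquire(void) {
    /* entry section: spin (busy-wait) until the flag was previously clear */
    while (atomic_flag_test_and_set(&lock))
        ;   /* busy waiting */
}

void release(void) {
    /* exit section: clear the flag so another thread/process may enter */
    atomic_flag_clear(&lock);
}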
9. Semaphores
  • Introduced by E.W. Dijkstra
  • Motivation: avoid busy waiting by locking a process execution until some condition is satisfied
  • Two operations are defined on a semaphore variable s
    • wait(s): also called P(s) or down(s)
    • signal(s): also called V(s) or up(s)
  • We will assume that these are the only user-visible operations on a semaphore 
10. Semaphore Operations
  • Conceptually, a semaphore has an integer value. This value is greater than or equal to 0
  • wait(s)
    • wait/block until s.value > 0, then s.value-- //executed atomically
  • a process executing the wait operation on a semaphore with value 0 is blocked until the semaphore's value becomes greater than 0
    • no busy waiting
  • signal(s)
    • s.value++ //executed atomically
  • If multiple processes are blocked on the same semaphore "s", only one of them will be awakened when another process performs signal(s) operation
  • Semaphore as a general synchronization tool: semaphores provide a general process synchronization mechanism beyond the "critical section" problem
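  • As an illustration with POSIX semaphores, where sem_wait is the wait/P operation and sem_post is the signal/V operation (a sketch using two threads, not the only way to use them):

#include <semaphore.h>
#include <pthread.h>
#include <stdio.h>

sem_t s;                 /* semaphore guarding the critical section */
int shared_counter = 0;  /* shared data */

void *worker(void *arg) {
    sem_wait(&s);        /* wait(s): blocks (no busy waiting) if s == 0 */
    shared_counter++;    /* critical section */
    sem_post(&s);        /* signal(s) */
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    sem_init(&s, 0, 1);  /* initial value 1 -> behaves as a mutual-exclusion lock */
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("%d\n", shared_counter);   /* always 2 */
    sem_destroy(&s);
    return 0;
}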

11. Deadlocks and Starvation
  • A set of processes is said to be in a deadlock state when every process in the set is waiting for an event that can only be caused by another process in the set
  • A process that is forced to wait indefinitely in a synchronization program is said to be subject to starvation
    • in some execution scenarios, that process does not make any progress
    • deadlock implies starvation, but the reverse is not true
12. Classical problems of synchronization
  • Producer-Consumer Problem
  • Readers-Writers Problem
  • Dining-Philosophers Problem
  • The solution will use only semaphores as synchronization tools and busy waiting is to be avoided.
13. Producer-Consumer Problem
  • Problem
    • The bounded-buffer producer-consumer problem assumes there is a buffer of size n
    • The producer process puts items to the buffer area
    • The consumer process consumes items from the buffer
    • The producer and the consumer execute concurrently
  • Make sure that
    • the producer and the consumer do not access the buffer area and related variables at the same time
    • no item is made available to the consumer if all the buffer slots are empty
    • no slot in the buffer is made available to the producer if all the buffer slots are full
  • Shared data
    • semaphore full, empty, mutex
  • Initially
    • full = 0; //the number of full buffers
    • empty = n; //the number of empty buffers
    • mutex = 1; // semaphore controlling the access to the buffer pool
  • Producer Process

  • Consumer Process
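  • The slides' code is not reproduced above; a standard sketch of both loops, consistent with the initialization given (full = 0, empty = n, mutex = 1):

// Producer process
do {
    // produce an item
    wait(empty);      // wait for a free slot
    wait(mutex);      // enter critical section on the buffer
    // add the item to the buffer
    signal(mutex);
    signal(full);     // one more full slot
} while (1);

// Consumer process
do {
    wait(full);       // wait for an available item
    wait(mutex);
    // remove an item from the buffer
    signal(mutex);
    signal(empty);    // one more empty slot
    // consume the item
} while (1);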


14. Readers-Writers Problem

  • Problem
    • A data object( e.g. a file) is to be shared among several concurrent processes
    • A write process must have exclusive access to the data object
    • Multiple reader processes may access the shared data simultaneously without a problem
  • Shared data
    • semaphore mutex, wrt
    • int readcount
  • Initially
    • mutex = 1
    • readcount = 0, wrt= 1
  • Writer Process

  • Reader Process
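  • The original code is not reproduced here; the classic sketch, consistent with the shared data above (wrt = 1, mutex = 1, readcount = 0):

// Writer process
do {
    wait(wrt);            // exclusive access to the data object
    // perform the write
    signal(wrt);
} while (1);

// Reader process
do {
    wait(mutex);          // protect readcount
    readcount++;
    if (readcount == 1)
        wait(wrt);        // first reader locks out writers
    signal(mutex);
    // perform the read
    wait(mutex);
    readcount--;
    if (readcount == 0)
        signal(wrt);      // last reader lets writers in
    signal(mutex);
} while (1);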

15. Dining-Philosophers Problem
  • Problem
    • five philosophers share a common circular table
    • there are five chopsticks and a bowl of rice (in the middle)
    • when a philosopher gets hungry, he tries to pick up the closest chopsticks
    • a philosopher may pick up only one chopstick at a time, and he cannot pick up one that is already in use. 
    • when done, he puts down both of his chopsticks, one after the other.
  • Shared Data
    • semaphore chopsticks [5]
  • Initially
    • all semaphore values are 1
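  • A straightforward sketch for philosopher i follows; note that this naive version can deadlock if all five philosophers pick up their left chopstick at the same time:

// Philosopher i (i = 0, ..., 4)
do {
    wait(chopstick[i]);              // pick up left chopstick
    wait(chopstick[(i + 1) % 5]);    // pick up right chopstick
    // eat
    signal(chopstick[i]);            // put down left chopstick
    signal(chopstick[(i + 1) % 5]);  // put down right chopstick
    // think
} while (1);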

Thread

1. Definition

  • Idea: allow multiple threads of execution within the same process environment, to a large degree independent of each other.
  • Multiple threads running in parallel in one process is analogous to having multiple processes running in parallel in one computer.
  • Multiple threads within a process will share 
    • the address space
    • open files
    • other resources
  • Potential for efficient and close cooperation


2. Multithreading
  • when a multithread process is run on a single CPU system, the threads take turns running
  • All threads in the process have exactly the same address space
  • each thread can be in any one of the several states, just like processes
  • Each thread has its own stack



3. Benefits
  • Responsiveness
    • multithreading an interactive application may allow a program to continue running even if part of it is blocked or performing a lengthy operation
  • Resource sharing
    • sharing the address space and other resources may result in high degree of cooperation
  • Economy
    • creating/managing processes is much more time consuming than managing threads
  • Better utilization of multiprocessor architectures
4. Example: a multi-threaded web server



5. Implementing threads
  • processes usually start with a single thread
  • usually, library procedures are invoked to manage threads
    • thread_create: typically specifies the name of the procedure for the new thread to run
    • thread_exit
    • thread_join: blocks the calling thread until another (specific) thread has exited
    • thread_yield: voluntarily gives up the CPU to let another thread run
  • threads may be implemented in the user space or in the kernel space
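  • For illustration, with POSIX threads (one common such library) these procedures correspond roughly to pthread_create, pthread_exit and pthread_join; a minimal sketch:

#include <pthread.h>
#include <stdio.h>

void *print_message(void *arg) {
    printf("hello from thread: %s\n", (const char *)arg);
    pthread_exit(NULL);              /* thread_exit */
}

int main(void) {
    pthread_t tid;
    /* thread_create: the new thread runs print_message */
    pthread_create(&tid, NULL, print_message, "worker");
    /* thread_join: block until the worker has exited */
    pthread_join(tid, NULL);
    return 0;
}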
6. User-level thread
  • user threads are supported above the kernel and are implemented by a thread library at the user level
  • the library (or run-time system) provides support for thread creation, scheduling and management with no support from the kernel
  • when threads are managed in user space, each process needs its own private thread table to keep track of the threads in that process
  •  the thread-table keeps track only of the per-thread items (program counter, stack pointer, register, state...)
  • when a thread does something that may cause it to be blocked locally (e.g., wait for another thread), it calls a run time system procedure
  • if the thread must be put into blocked state, the procedure performs switching

7. User-level advantage
  • the operating system does not need to support multi-threading
  • since the kernel is not involved, thread switching may be very fast
  • each process may have its own customized thread scheduling algorithm
  • thread scheduler may be implemented in the user space very efficiently
8. User-level threads: problems
  • the implementation of blocking system calls is highly problematic (e.g., read from the keyboard). All the threads in the process risk being blocked
  • Possible solutions
    • change all system-call to non-blocking
    • sometimes it may be possible to tell in advance if a call will block (e.g., the select system call in some versions of Unix) 
      • jacket (wrapper) code around system calls
9. Kernel-level threads
  • kernel threads are supported directly by the OS
    • the kernel provides thread creation, scheduling and management in the kernel space
  • the kernel has a thread table that keeps track of all threads in the system
  • all calls that might block a thread are implemented as system calls (at greater cost)
  • when a thread blocks, the kernel may choose another thread from the same process, or a thread from a different process


10. Hybrid Implementation
  • An alternative solution is to use kernel-level threads, and then multiplex user-level threads onto some or all of the kernel threads
  • A kernel-level thread has some set of user-level threads that take turns using it




11. Windows XP threads
  • Windows XP supports kernel-level threads
  • the primary data structures of a thread are 
    • ETHREAD (executive thread blocks)
      • thread start address
      • pointer to parent process
      • pointer to corresponding KTHREAD
    • KTHREAD (kernel thread block)
      • scheduling and synchronizing information
      • kernel stack (used when the thread is running in kernel mode)
      • pointer to TEB
    • TEB (thread environment block)
      • thread identifier
      • user-mode stack
      • thread-local storage
12. Linux Threads
  • In addition to fork() system call, Linux provides the clone() system call, which may be used to create threads
  • Linux uses the term task (rather than process or thread) when referring to a flow of control
  • A set of flags, passed as arguments to the clone() system call determine how much sharing is involved (e.g., open files, memory space, etc).

Process Communication

1. Process Communication

  • Mechanism for processes to communicate and to synchronize their actions
  • Two models
    • communication through a shared memory
    • communication through message passing

2. Communication through message passing

  • Message system
    • processes communicate with each other without resorting to shared variables
    • A message passing facility must provide at least two operations
      • send(message, recipient)
      • receive(message, recipient)
    • with indirect communication, messages are sent to and received from mailboxes
  • Messaging passing can be either blocking (synchronous) or non-blocking (asynchronous)
    • blocking send: the sending process is blocked until the message is received by the receiving process or by the mail box
    • non-blocking send: the sending process resumes the operation as soon as the message is received by the kernel
    • blocking receive: the receiver blocks until the message is available
    • non-blocking receive: receive operation does not block, it either returns a valid message or a default value (null) to indicate a non-existing message
3. Communication through shared memory
  • The memory region to be shared must be explicitly defined
  • using system calls, in unix
    • shmget: creates a shared memory block
    • shmat: maps an existing shared memory block into a process's address space
    • shmdt: removes (unmaps) a shared memory block from the process's address space
    • shmctl: is a general-purpose function allowing various operations on the shared block (receive information about the block, set the permissions, lock in memory)
  • Problems with simultaneous access to the shared variables
  • Compilers for concurrent programming languages can provide direct support when declaring variables.
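  • A minimal sketch of these System V calls (error handling omitted; the 4 KB size is arbitrary):

#include <sys/ipc.h>
#include <sys/shm.h>
#include <string.h>
#include <stdio.h>

int main(void) {
    /* shmget: create a 4 KB shared memory block */
    int shmid = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);

    /* shmat: map the block into this process's address space */
    char *p = (char *) shmat(shmid, NULL, 0);

    strcpy(p, "hello");             /* the block is now usable as ordinary memory */
    printf("%s\n", p);

    shmdt(p);                       /* shmdt: unmap the block */
    shmctl(shmid, IPC_RMID, NULL);  /* shmctl: mark the block for removal */
    return 0;
}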

Sunday, December 21, 2014

Virtual Machine

1. Virtual Machines

  • Originally proposed and implemented for VM Operating System (IBM)
  • A virtual machine provides an interface identical to the underlying bare hardware
  • Each user is given its own virtual machine
  • The operating system creates the illusion of multiple processes, each executing on its own processor with its own (virtual) memory

Operating System Design Approaches

1. Simple Structure

  • some operating systems do not have well defined structures. Often these started as simple systems and grew beyond their original scope.
  • MS-DOS
    • written to provide the most functionality in the least space
      • not divided into modules
      • although MS-DOS has some structure, its interfaces and levels of functionality are not well separated.


1.1. UNIX System Structure
  • UNIX: limited by hardware functionality, the original UNIX operating system has limited structure.
  • The Unix OS consists of two separated system parts
    • system programs
    • the kernel (everything below the system call interface and above the physical hardware)
      • provides the file system, CPU scheduling, memory management, and other operating system functions
      • A large number of functions for one level



2. Layered Approach
  • The operating system is divided into a number of layers (levels), each built on top of lower layers. 
  • The bottom layer (layer 0) is the hardware; the highest (layer N) is the user interface
  • With modularity, layers are selected such that each uses functions (operations) and services of only lower-level layers
  • Simplifies debugging and system verification
3. Modular Approach
  • Modular kernel
    • the kernel has a set of core components
    • dynamic links in additional services either during boot time or during run-time
    • common in modern implementations of Unix such as Linux and Solaris
  • The microkernel approach moves as much as possible from the kernel into "user space"
  • Communication takes place between user modules using "message passing"



  • Benefits
    • easier to extend
    • more reliable (less code is running in kernel mode)
    • convenient for distributed architectures
    • security
  • Many modern OS are designed as microkernels
    • Apple Mac OS X (based on Mach)
    • Many smartphone OSs
      • Android (L4 microkernel family)
      • iPhone OS (based on Mach)



System Call

1. User operating-system Interface

  • Two main approaches
    • command-line interpreter (a.k.a. command interpreter or shell)
    • graphical user interface (GUI)
  • The shell
    • allows users to directly enter commands that are to be performed by the operating system
    • is usually a system program (not part of the kernel)
  • GUI allows a mouse-based window-and-menu system
  • Some systems allow both (e.g., X-Windows in Unix)
2. System Calls
  • System calls provide the interface between a running program and the operating system
    • generally available in routines written in C and C++
    • certain low-level tasks may have to be written using assembly language
  • Typically, application programmers design programs using an application programming interface (API)
  • The run-time support system (run-time libraries) provides a system-call interface, that intercepts function calls in the API and invokes the necessary system call within the operating system.
3. Example System-Call Processing




4. System Call: read(fd, buffer, nbytes)



5. Major System Calls in Unix: Process Management
  • pid = fork()
    • create a child process identical to the parent
  • pid = waitpid( pid, &statloc, options)
    • wait for a child to terminate
  • s = execve(name, argv, environp)
    • replace a process's core image
  • exec()
    • i.e., load the selected program into memory
  • exit(status)
    • terminate process execution and return status 
  • s = kill(pid, signal)
    • send a signal to a process
6. System program
  • System programs provide a convenient environment for program development
  • They can provide various services
    • status information
    • file modification
    • programming language support
    • program loading and execution
    • communications

System Security

1. Protection and Security

  • Protection
    • any mechanism for controlling accesses of processes or users to resources defined by the OS
  • Security
    • defense of the system against internal and external attacks
      • huge range, including denial-of-service, worms, viruses, identity theft, theft of services
  • Systems generally first distinguish among users, to determine who can do what
    • user identities (user IDs, security IDs) include a name and an associated number, one per user
    • user ID then associated with all files and processes of that user to determine access control 
    • group identifier (group ID) allows a set of users to be defined and controls managed, then also associated with each process, file 
    • privilege escalation allows a user to change to an effective ID with more rights

I/O System Management

1. I/O System Management

  • The operating system will hide the peculiarities of specific hardware from the user
  • In Unix, the I/O subsystem consists of 
    • a buffering, caching and spooling system
    • a general device-driver interface
    • drivers for specific hardware devices
  • Interrupt handlers and device drivers are crucial in the design of efficient I/O subsystems

File Management

1. File Management

  • A file is a collection of related information defined by its creator
  • Commonly, files represent programs (both source and object forms) and data
  • The operating system responsibilities
    • file creation and deletion
    • directory creation and deletion
    • support of primitives for manipulating files and directories
    • mapping files onto secondary storage
    • file backup on stable (non-volatile) storage media

Storage Management

1. Storage Management
  • OS provides uniform, logical view of information storage
    • abstracts physical properties to logical storage unit (file)
    • each medium is controlled by device (i.e., disk drive, tape drive)
      • varying properties include access speed, capacity, data-transfer rate, access method (sequential or random)
  • File system management
    • files usually organized into directories
    • access control on most file systems to determine who can access what
    • OS activities include
      • creating and deleting files and directories
      • primitives to manipulate files and dirs
      • mapping files onto secondary storage
      • backup files onto stable (non-volatile) storage media
2. Storage-Device Hierarchy



3. Caching
  • Important principle, performed at many levels in a computer (in hardware, operating system, software)
  • Information in use copied from slower to faster storage temporarily
  • Faster storage (cache) checked first to determine if information is there
    • if it is, information used directly from the cache (first)
    • if not, data copied to cache and used there
  • Cache smaller than storage being cached
    • cache management important design problem
    • cache size and replacement policy

Memory Management

1. Memory Management

  • all data in memory before and after processing
  • all instructions in memory in order to execute
  • memory management determines what is in memory, and when 
    • optimizing CPU utilization and computer response to users
  • memory management activities
    • keep track of which parts of memory are currently being used and by whom
    • deciding which processes (or parts thereof) and data to move into and out of memory
    • allocating and deallocating memory space as needed
  • virtual memory management is an essential part of most operating systems



    Process


    1. Definition

    • Process: a program in execution
      • process execution must progress in sequential fashion
    • A program is a passive entity, whereas a process is an active entity with a program counter and a set of associated resources.
    • Each process has its own address space
      • Text section (text segment) contains the executable code
        • the program counter and CPU registers are part of the process context.
      • Data section (data segment) contains the global variables
      • Stack contains temporary data (local variables, return addresses...)
      • A process may contain a heap, which contains memory that is dynamically allocated at run-time

    2. OS requirements for processes

    • OS must interleave the execution of several processes to maximize CPU usage while providing reasonable response time
    • OS must allocate resources to processes while avoiding deadlock
    • OS must support inter process communication and user creation of process.

    3. A simple implementation of processes
    • The process index register contains the index into the process list of the currently executing process (B)
    • A process switch from B to A consists of storing (in memory) B's context and loading (in CPU registers) A's context
    • A data structure that provides flexibility (to add new features)

    4. Process Creation
    • Principal events that cause process creation
      • system initialization
      • execution of a process creation system call by a running process
      • user request to create new process
    • Parent process creates child processes, which, in turn create other processes, forming a tree (hierarchy) of processes.
    • Issues
      • will the parent and child execute concurrently?
      • how will the address space of the child be related to the parent?
      • will the parent and child share some resources?
    5. An example Process




    6. Process Creation in Unix
    • Each process has a process identifier (pid)
    • The parent executes fork() system call to spawn a child
    • The child process has a separate copy of the parent's address space
    • Both the parent and the child continue execution at the instruction following the fork() system call
    • Typically, the child executes a system call like execlp() to load a binary file into memory
    7. Example program with "fork"
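    • The original listing is not reproduced here; a typical example of the pattern described in item 6 (fork, then execlp in the child, wait in the parent):

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    pid_t pid = fork();                 /* create a child process */

    if (pid == 0) {                     /* child: continues here with a copy of the parent's address space */
        execlp("/bin/ls", "ls", NULL);  /* overlay the child with a new program */
    } else if (pid > 0) {               /* parent */
        wait(NULL);                     /* wait for the child to terminate */
        printf("child complete\n");
    }
    return 0;
}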

    8. Process Termination

    • process executes its last statement and asks the operating system to delete it (exit)
      • output data from child to parent (via wait or waitpid)
      • process' resources are deallocated by operating system
    • parent may terminate execution of children processes
      • e.g., TerminateProcess() in Win32
    • process may also terminate due to errors
    • cascading termination: when a system does not allow a child process to continue after the parent has terminated
    9. Process States
    • Running states
    • Ready states
    • Blocked states
    • New state
      • OS has performed the necessary actions to create the process but has not yet admitted the process
    • Exit state
      • termination moves the process to this state
      • tables and other info are temporarily preserved for auxiliary program



    10. Swapping/Suspending
    • Processes may need to be swapped out to disk
      • this is true even with virtual memory
    • 2 new states
      • blocked suspend: blocked processes which have been swapped out to disk
      • ready suspend: ready processes which have been swapped out to disk


    11. Process Scheduling
    • The operating system is responsible for managing the scheduling activities
      • a uniprocessor system can have only one running process at a time
      • the main memory cannot always accommodate all processes at run-time
      • The operating system will need to decide on which process to execute next (CPU scheduling), and which processes will be brought to the main memory (job scheduling)
    12. Process Scheduling Queues
    • Job queue: set of all processes in the system
    • Ready queue: set of processes residing in the main memory, ready and waiting for CPU
    • Device queue: set of processes waiting for an I/O device
    • Process migration is possible among these queues
    13. Schedulers
    • The processes may be first spooled to a mass-storage system, where they are kept for later execution
    • The long-term scheduler (or job scheduler)
      • selects processes from this pool and loads them into memory for execution
      • the long-term scheduler, if it exists, controls the degree of multiprogramming
    • The short-term scheduler (or CPU scheduler)
      • selects from among ready processes, and allocates the CPU to one of them
      • unlike the long-term scheduler, the short-term scheduler is invoked very frequently
    14. CPU and I/O bursts
    • CPU-I/O burst cycle
      • process execution consist of a cycle of CPU execution and I/O wait
    • I/O bound process
      • spends more time doing I/O than computations; many short CPU bursts
    • CPU-bound process
      • spends more time doing computation; few very long CPU bursts

    Instruction Execution

    1. Instruction Execution
    • While executing a program, the CPU
      • fetches the next instruction from memory (loading into IR)
      • decodes it to determine its type and operands
      • executes it
    • May take multiple clock cycles to execute an instruction
    • Example:
      • LOAD R1, #3
      • LOAD R2, M2
      • STORE M3, R4
      • ADD R1, R2, R3
    • Each CPU has a specific set of instructions that it can execute (instruction-set architecture)

    2. Registers
    • General registers (data/address)
    • Program Counter (PC): contains the memory address of the next instruction to be fetched
    • Stack Pointer (SP): points to the top of the current stack in memory.
      • the stack contains one frame for each procedure that has been entered but not yet exited
    • Program Status Word (PSW): contains the condition code bits and various other control bits.
    • Note: when time multiplexing the CPU, the operating system will often stop the running program to (re)start another one. In these cases, it must save the "state information" (e.g., value of the registers)

    3. Computer-System Operation

    • I/O devices and the CPU can execute concurrently
    • Each device controller has local buffer(s).
    • CPU moves data from/to main memory to/from  local buffers.
    • I/O is from/to device to/from local buffer of controller
    • The device driver is special operating software that interacts with the device controller
    • Typically, the device controller informs CPU that it has finished its operation by causing an interrupt.
    4. Instruction Cycle with Interrupts


    5. Classes of Interrupts
    • I/O interrupts: generated by an I/O controller, to signal normal completion of an operation or to signal a variety of error conditions
    • Timer Interrupts: generated by a timer within the processor. This allows the operating system to perform certain functions on a regular basis.
    • Hardware Failure Interrupts: generated by a failure (e.g., power failure or memory parity error).
    • Traps (Software Interrupts): generated by some condition that occurs as a result of an instruction execution
      • error
      • user request for an operating system service
    6. Interruption Mechanism
    • Interrupt transfers control to the interrupt service routine, generally through the interrupt vector (e.g., Intel), which contains the addresses of all the service routines. (Alternatively, the machine has a status register or cause register that holds the reason for the interrupt - MIPS architecture)
    • Interrupt Service Routines (ISRs): separate segments of code determine what action should be taken for each type of interrupt
    • Once the interrupt has been serviced by the ISR, the control is returned to the interrupted program. Need to save the "process state" (register, PC, ...) before ISR takes over.
    7. Basic Interrupt Processing
    • the interrupt is issued
    • processor finishes execution of current instruction
    • processor signals acknowledgement of interrupt
    • processor pushes PSW and PC onto control stack
    • processor loads new PC value through the interrupt vector
    • ISR saves remainder of the process state information
    • ISR executes
    • ISR restores process state information
    • Old PSW and PC values are restored from the control stack

    8. I/O Structure

    • After I/O starts, control returns to user program only upon I/O completion
      • wait instruction idles the CPU until the next interrupt
      • wait loop (contention for memory access)
      • at most one I/O request is outstanding at a time, no simultaneous  I/O processing
    • After I/O starts, control returns to user program without waiting for I/O completion
      • system call: request to the operating system to allow user to wait for I/O completion
      • device-status table contains an entry for each I/O device indicating its type, address, and state
      • operating system indexes into I/O devices table to determine device status and to modify table entry to include interrupt

    9. Dual-Mode Operation
    • Operating system must protect itself and all other programs (and their data) from any malfunctioning program
    • Provide hardware support to differentiate between at least two modes of operations
      • user mode: execution done on behalf of a user
      • kernel mode: (also monitor mode or system mode): execution done on behalf of operating system.
    • Mode bit added to computer hardware to indicate the current mode
      • kernel: 0
      • user: 1
    • When an interrupt occurs hardware switches to kernel mode
    • Privileged instructions can be issued only in kernel mode
    10. Transition from user to kernel mode






    Introduction to Operating Systems

    1. What is an Operating System

    • A program that acts as an intermediary between the user of a computer and the computer hardware
    • Operating system goals
      • Convenience: make the computer system convenient to use
      • Efficiency: Manage the computer system resources in an efficient manner
    • "The one program running at all times on the computer" is the kernel. Everything else is either a system program or an application program.
    2. OS Features needed for multiprogramming
    • Job Scheduling: must choose the processes that will be brought to memory
    • Memory Management: must allocate the memory to several jobs
    • CPU Scheduling: must choose among several jobs ready to run
    3. Parallel System
    • Multiprocessor systems with more than one CPU in close communication
    • Tightly coupled system
      • processors share memory and a clock;
      • communication usually takes place through the shared memory
    • Advantage of parallel system
      • increased throughput
      • economy of scale
      • increased reliability
    4. Distributed Systems
    • Distributes the computation among several physical processors
    • Loosely coupled system
      • each processor has its own local memory
      • processors communicate with one another through various communication lines
    • Advantage of distributed systems
      • resource and load sharing
      • reliability
      • communications


    Thursday, December 11, 2014

    Average Age of Renewal Process


    1. Definition


    • Let $S_n$ be the n-th renewal
    • Then $S_{N(t)}$ = time of the last renewal (prior to time t)
    • Let $A(t) = t - S_{N(t)}$
      • $A(t)$ is called the age of the renewal process
      • Interpretation: A(t) is the time since last renewal 

    2. Average Age of Renewal Process

    • What is $R_n$? $R_n = \int^{S_n}_{S_{n-1}} A(t) dt $
    • $R_n$ is the area under the curve for one cycle
    • so $E[R_n] = E[X_n^2/2] = E[X_n^2]/2$
    • Thus the long-run average age of the process is 
      • $\frac{E[R_n]}{E[X_n]} = \frac{E[X^2_n]}{2E[X_n]}$
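    • For example (a standard special case, not from the notes above): if the interarrival times $X_n$ are exponential with rate $\lambda$, then $E[X_n] = 1/\lambda$ and $E[X_n^2] = 2/\lambda^2$, so the long-run average age is $\frac{2/\lambda^2}{2 \cdot 1/\lambda} = \frac{1}{\lambda}$, i.e., one full mean interarrival time.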