Evolution of Data Abstraction in Operating Systems

Explore the evolution of central data abstraction in operating systems from the complexity of Multics to the simplicity and elegance of Unix. Discover how files are managed in Unix, the tradeoffs in data-sharing methods, and the impacts on efficiency and protection.


Uploaded on Oct 03, 2024


Presentation Transcript


  1. The UNIX Time-Sharing System
     Landon Cox
     February 1, 2017

  2. Multics
     Multi-user operating system
       - Primary goal was to allow efficient, safe sharing between users
     Central data abstraction in Multics: a segment
       - All data was contained within a segment
       - No distinction between files and memory
       - Accessed through loads/stores in memory
       - Think of a segment as an mmapped region of memory

  3. Unix
     Also a multi-user operating system
       - In many ways a response to the complexity of Multics
       - Primary goals were simplicity, elegance, and ease of use
     What is the central data abstraction in Unix? A file
       - As in Multics, hierarchical namespace
       - Mapped human-readable names to data objects
     Three kinds of files
       - Ordinary files
       - Directories
       - Special files

  4. Files in Unix
     How are files read and written?
       - Via explicit read/write system calls
       - Requires passing a buffer between process and kernel
     In what way is this better than Multics segments?
       - Much narrower interface
       - Don't have to worry about stray loads/stores
       - Clean separation of ephemeral and persistent state
     What is the downside compared to segments?
       - Requires extra copying
       - Kernel makes a copy of the buffer in its own address space

  5. Data-sharing tradeoffs
     Share by reference: efficiency
       - One copy of shared data
       - Only copy reference
       - Changes to copies are global
       - Corruption visible to all
     Share by value: protection
       - Spend time creating copies
       - Spend memory holding copies
       - Changes to copies are local
       - Corruption can be contained

  6. Data-sharing tradeoffs
     How to share by reference, by value?
         int P(int a) { }
         void C(int x) { int y = P(x); }
     [Spectrum: share by reference (efficiency) to share by value (protection)]
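
One way to make the two modes on this slide concrete is the C sketch below. The function names bump_by_value and bump_by_reference are hypothetical and do not come from the slides; the sketch only assumes standard C argument passing.

```c
#include <assert.h>
#include <stddef.h>

/* Share by value: the callee works on its own copy of the argument.
 * (Hypothetical name, for illustration only.) */
void bump_by_value(int a) {
    a += 1;      /* changes only the callee's private copy */
    (void)a;     /* silence "set but unused" warnings */
}

/* Share by reference: the callee receives the address of the caller's data.
 * (Hypothetical name, for illustration only.) */
void bump_by_reference(int *a) {
    assert(a != NULL);
    *a += 1;     /* the change is visible to the caller */
}
```

Calling bump_by_value(x) leaves the caller's x untouched, while bump_by_reference(&x) increments it. The first costs a copy but contains any damage; the second avoids the copy but lets the callee corrupt the caller's state.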

  7. Data-sharing tradeoffs
     What was the default sharing mode for Multics?
       - Share by reference (via segments)
     [Spectrum: share by reference (efficiency) to share by value (protection)]

  8. Data-sharing tradeoffs
     Unix's approach is very different
       - By default, share by value
       - Support share by reference when needed
     [Spectrum: share by reference (efficiency) to share by value (protection)]

  9. UNIX philosophy
     OS by programmers for programmers
       - Support high-level languages (C and scripting)
       - Make interactivity a first-order concern (via shell)
       - Allow rapid prototyping
     How should you program for a UNIX system?
       - Write programs with limited features
           - Do one thing and do it well
           - Support easy composition of programs
       - Make data easy to understand
           - Store data in plaintext (not binary formats)
           - Communicate via text streams
     Thompson and Ritchie: Turing Award '83

  10. UNIX philosophy
     [Diagram: Kernel, Process C, ?, Process P]
     What is the core abstraction?
       - Communication via files

  11. UNIX philosophy
     [Diagram: Kernel, File, Process C, Process P]
     What is the interface?
       - Open: get a file reference (descriptor)
       - Read/Write: get/put data
       - Close: stop communicating
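
The open, read/write, close interface can be sketched with the POSIX calls of the same names. This is a minimal illustration, not code from the lecture: the helper name file_round_trip is hypothetical, and a short write is simply treated as an error.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Write msg to path, then read it back into out (capacity cap).
 * Returns the number of bytes read back, or -1 on any error.
 * (Hypothetical helper, for illustration only.) */
ssize_t file_round_trip(const char *path, const char *msg, char *out, size_t cap) {
    assert(path != NULL && msg != NULL && out != NULL);
    size_t len = strlen(msg);

    /* Open: get a descriptor that refers to the file. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;
    ssize_t w = write(fd, msg, len);   /* kernel copies data out of our buffer */
    close(fd);                         /* Close: stop communicating */
    if (w < 0 || (size_t)w != len) return -1;

    fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    ssize_t n = read(fd, out, cap);    /* kernel copies data into our buffer */
    close(fd);
    return n;
}
```

Every transfer goes through a buffer the process hands to the kernel, which is the extra copying the earlier slide names as the downside compared to segments.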

  12. UNIX philosophy
     [Diagram: Kernel, File, Process C, Process P]
     Why is this safer than procedure calls?
       - Interface is narrower
       - Access file in a few well-defined ways
       - Kernel ensures things run smoothly

  13. UNIX philosophy
     [Diagram: Kernel, File, Process C, Process P]
     How do we transfer control to kernel?
       - System call instruction (software trap)
       - CPU pauses process, runs kernel
       - Kernel schedules other process

  14. UNIX philosophy
     [Diagram: Kernel, File, Process C, Process P]
     Key insight: interface can be used for lots of things
       - Persistent storage (i.e., real files)
       - Devices, temporary channels (i.e., pipes)

  15. UNIX philosophy
     [Diagram: Kernel, File, Process C, Process P]
     Two questions
       (1) How do processes start running?
       (2) How do we control access to files?

  16. UNIX philosophy
     [Diagram: Kernel, File, Process C, Process P]
     Two questions
       (1) How do processes start running?

  17. UNIX philosophy
     [Diagram: Kernel, File, Process C, Process P]
     Maybe P is already running?

  18. UNIX philosophy
     [Diagram: Kernel, File, Process C, Process P]
     What might we call such a process?
       - Process C wants to talk to a process someone else launched
       - Basically what a server is

  19. UNIX philosophy
     [Diagram: Kernel, File, Process C, Process P]
     All processes shouldn't be servers
       - Want to launch processes on demand
       - C needs primitives to create P

  20. UNIX shell
     [Diagram: Kernel, Shell]
     Program that runs other programs
       - Interactive (accepts user commands)
       - Essentially just a line interpreter
       - Allows easy composition of programs

  21. UNIX shell
     How does a UNIX process interact with a user?
       - Via standard in (fd 0) and standard out (fd 1)
       - These are the default input and output for a program
       - Establishes well-known data entry and exit points for a program
     How do UNIX processes communicate with each other?
       - Mostly communicate with each other via pipes
       - Pipes allow programs to be chained together
       - Shell and OS can connect one process's stdout to another's stdin
     Why do we need pipes when we have files?
       - Pipes create unnamed temporary buffers between processes
       - Communication between programs is often ephemeral
       - OS knows to garbage collect resources associated with pipe on exit
       - Consistent with UNIX philosophy of simplifying programmers' lives

  22. UNIX shell
     Pipes simplify naming
       - Program always receives input on fd 0
       - Program always emits output on fd 1
       - Program doesn't care what is on the other end of fd
       - Shell/OS handle input/output connections
     How do pipes simplify synchronization?
       - Pipe accessed via read system call
       - Read can block in kernel until data is ready
       - Or can poll, checking to see if read returns enough data
     File descriptor demo
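
The blocking-read behavior of pipes can be sketched with the POSIX pipe and fork calls. The helper name pipe_from_child is hypothetical and is not the lecture's file descriptor demo. A real shell would also dup2 the pipe ends onto fd 0 and fd 1 before exec, which this sketch leaves out.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child writes msg into a pipe; the parent reads it into out (capacity cap).
 * Returns the number of bytes read, or -1 on error.
 * (Hypothetical helper, for illustration only.) */
ssize_t pipe_from_child(const char *msg, char *out, size_t cap) {
    assert(msg != NULL && out != NULL);
    int fds[2];                        /* fds[0] = read end, fds[1] = write end */
    if (pipe(fds) < 0) return -1;

    pid_t pid = fork();
    if (pid < 0) { close(fds[0]); close(fds[1]); return -1; }

    if (pid == 0) {                    /* child: producer */
        close(fds[0]);
        ssize_t w = write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(w < 0 ? 1 : 0);
    }

    close(fds[1]);                     /* parent: consumer, keeps only the read end */
    size_t total = 0;
    ssize_t n = 0;
    /* read() blocks in the kernel until data arrives or every writer has closed. */
    while (total < cap && (n = read(fds[0], out + total, cap - total)) > 0)
        total += (size_t)n;
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return n < 0 ? -1 : (ssize_t)total;
}
```

The parent never names the buffer and never polls. The kernel wakes it when the child has written, and the pipe's storage is reclaimed once both ends are closed.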

  23. How kernel starts a process
     1. Allocates process control block (bookkeeping data structure)
     2. Reads program code from disk
     3. Stores program code in memory (could be demand-loaded too)
     4. Initializes machine registers for new process
     5. Initializes translator data for new address space
          - E.g., page table and PTBR
          - Virtual addresses of code segment point to correct physical locations
     6. Sets processor mode bit to user
     7. Jumps to start of program

  24. Creating processes
     Through what commands does UNIX create processes?
       - Fork: create a child process that is a copy of the parent
       - Exec: initialize address space with new program
     What's the problem with creating an exact copy of a process?
       - Child needs to do something different than parent
       - i.e., child needs to know that it is the child
     How does child know it is child?
       - Pass in return point
       - Parent returns from fork call, child jumps into other region of code
       - Fork works slightly differently now

  25. Fork
     Child can't be an exact copy
       - Is distinguished by one variable (the return value of fork)
         if (fork() == 0) {
             /* child */
             execute new program
         } else {
             /* parent */
             carry on
         }
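
A runnable variant of the fork pattern on this slide, as a sketch: the child exits with a chosen status instead of exec-ing a new program, so the parent has something to observe. The helper name fork_and_collect is hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits with child_status.
 * Returns the exit status the parent observes, or -1 on error.
 * (Hypothetical helper, for illustration only.) */
int fork_and_collect(int child_status) {
    assert(child_status >= 0 && child_status <= 255);
    pid_t pid = fork();
    if (pid < 0) return -1;            /* fork failed, no child was created */

    if (pid == 0) {
        /* child: fork returned 0 */
        _exit(child_status);
    }

    /* parent: fork returned the child's process id */
    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both processes return from the same fork call. The return value (0 in the child, the child's pid in the parent) is the one variable that tells them apart.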

  26. Creating processes
     Why make a complete copy of parent?
       - Sometimes you want a copy of the parent
       - Separating fork/exec provides flexibility
       - Allows child to inherit some kernel state
       - E.g., open files, stdin, stdout
       - Very useful for shell
     How do we efficiently copy an address space?
       - Use copy on write (COW)
       - Make copy of page table, set pages to read-only
       - Only make physical copies of pages on write fault

  27. Copy on write
     [Diagram: Physical memory, Parent memory, Child memory]
     What happens if parent writes to a page?

  28. Copy on write
     [Diagram: Physical memory, Parent memory, Child memory]
     Have to create a copy of pre-write page for the child.
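
Copy on write is invisible to programs; what a program can observe is the copy semantics it preserves. The sketch below checks that a write made by the child after fork never reaches the parent. The helper name parent_value_after_child_write is hypothetical, and whether the kernel underneath uses COW or an eager copy is an assumption the code cannot verify.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's view of a variable after the child has overwritten
 * its own copy of it, or -1 on error.
 * (Hypothetical helper, for illustration only.) */
int parent_value_after_child_write(int initial, int child_value) {
    assert(initial >= 0);              /* keeps -1 unambiguous as the error value */
    volatile int value = initial;      /* same virtual address in both processes after fork */

    pid_t pid = fork();
    if (pid < 0) return -1;

    if (pid == 0) {
        /* Under COW, the first write to a still-shared page triggers a private copy. */
        value = child_value;
        _exit(value == child_value ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0) return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) return -1;
    return value;                      /* the parent's copy is untouched */
}
```

For example, parent_value_after_child_write(1, 99) returns 1: the child saw 99 in its own address space, and the parent still sees 1.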

  29. Alternative approach
     Windows CreateProcess
       - Combines the work of fork and exec
     UNIX's approach
       - Supports arbitrary sharing between parent and child
     Windows's approach
       - Supports sharing of most common data via params

  30. Shells (bash, explorer, finder)
     Shells are normal programs
       - Though they look like part of the OS
     How would you write one?
         while (1) {
             print prompt ("crocus% ")
             ask for input (cin)                  // e.g., "ls /tmp"
             first word of input is command       // e.g., ls
             fork a copy of the current process (shell)
             if (child) {
                 redirect output to a file if requested (or a pipe)
                 exec new program (e.g., with argument "/tmp")
             } else {
                 wait for child to finish
                 or can run child in background and ask for another command
             }
         }
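
The body of the loop above (fork, optional redirect, exec, wait) can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the lecture's shell: the helper name run_command is hypothetical, prompt handling and input parsing are left out, and the exit codes 126 and 127 for setup and exec failures are a common shell convention adopted here.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run argv[0] with the NULL-terminated argument vector argv. If out_path is
 * non-NULL, the child's stdout is redirected to that file first.
 * Returns the child's exit status, or -1 on error.
 * (Hypothetical helper, for illustration only.) */
int run_command(char *const argv[], const char *out_path) {
    assert(argv != NULL && argv[0] != NULL);
    pid_t pid = fork();                /* fork a copy of the current process */
    if (pid < 0) return -1;

    if (pid == 0) {                    /* child */
        if (out_path != NULL) {
            int fd = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
            if (fd < 0 || dup2(fd, STDOUT_FILENO) < 0) _exit(126);
            if (fd != STDOUT_FILENO) close(fd);
        }
        execvp(argv[0], argv);         /* replace the copied shell with the new program */
        _exit(127);                    /* reached only if exec failed */
    }

    int status;                        /* parent: wait for child to finish */
    if (waitpid(pid, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the redirect happens between fork and exec, the new program simply writes to fd 1 and never learns where its output goes. This is the flexibility that keeping fork and exec separate buys.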

  31. UNIX philosophy
     [Diagram: Kernel, File, Process C, Process P]
     Two questions
       (1) How do processes start running?
       (2) How do we control access to files?

  32. UNIX philosophy
     [Diagram: Kernel, File, Process C, Process P]
     Two questions
       (1) How do processes start running?
       (2) How do we control access to files?

  33. Access control
     Where is most trusted code located?
       - In the operating system kernel
     What are the primary responsibilities of a UNIX kernel?
       - Managing the file system
       - Launching/scheduling processes
       - Managing memory
     How do processes invoke the kernel?
       - Via system calls
       - Hardware shepherds transition from user process to kernel
       - Processor knows when it is running kernel code
       - Represents this through protection rings or mode bit

  34. Access control
     How does kernel know if system call is allowed?
       - Looks at user id (uid) of process making the call
       - Looks at resources accessed by call (e.g., file or pipe)
       - Checks access-control policy associated with resource
       - Decides if policy allows uid to access resources
     How is a uid normally assigned to a process?
       - On fork, child inherits parent's uid
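
The check described on this slide can be modeled in a few lines. This is a deliberately simplified sketch, not actual kernel code: the function name may_write is hypothetical, and the model looks only at the owner and "other" write bits, ignoring groups, ACLs, and capabilities.

```c
#include <assert.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Simplified policy check: may a process running as `caller` write a resource
 * owned by `owner` with permission bits `mode`? Returns 1 if allowed, else 0.
 * (Hypothetical model, for illustration only.) */
int may_write(uid_t caller, uid_t owner, mode_t mode) {
    assert((mode & ~(mode_t)07777) == 0);    /* expects permission bits only */
    if (caller == 0) return 1;               /* uid 0 (root) bypasses the policy */
    if (caller == owner) return (mode & S_IWUSR) != 0;
    return (mode & S_IWOTH) != 0;
}
```

With mode 0644, may_write(1000, 1000, 0644) is 1 and may_write(1001, 1000, 0644) is 0: only the owner (or root) may modify the resource.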

  35. MOO accounting problem
     Multi-player game called Moo
       - Want to maintain high score in a file
     Should players be able to update score?
       - Yes
     Do we trust users to write file directly?
       - No, they could lie about their score
     [Diagram: Game client (uid x), x's score = 10; Game client (uid y), y's score = 11; High score file]

  36. MOO accounting problem
     Multi-player game called Moo
       - Want to maintain high score in a file
     Could have a trusted process update scores
       - Is this good enough?
     [Diagram: Game client (uid x), x's score = 10; Game client (uid y), y's score = 11; Game server; High score file (x:10, y:11)]

  37. MOO accounting problem
     Multi-player game called Moo
       - Want to maintain high score in a file
     Could have a trusted process update scores
       - Is this good enough?
       - Can't be sure that reported score is genuine
       - Need to ensure score was computed correctly
     [Diagram: Game client (uid x), x's score = 100; Game client (uid y), y's score = 11; Game server; High score file (x:100, y:11)]

  38. Access control
     Sometimes simple inheritance of uids is insufficient
       - Tasks involving management of user id state
       - Logging in (login)
       - Changing passwords (passwd)
     Where have we put management code before?
       - Put it in the kernel (e.g., file system and page table code)
     Why not put login, passwd, etc. inside the kernel?
       - This functionality doesn't really require interaction w/ hardware
       - Would like to keep kernel as small as possible
     How are trusted user-space processes identified?
       - Run as super user or root (uid 0)
       - Like a software kernel mode
       - If a process runs under uid 0, then it has more privileges

  39. Access control
     Why does login need to run as root?
       - Needs to check username/password correctness
       - Needs to fork/exec process under another uid
     Why does passwd need to run as root?
       - Needs to modify password database (file)
       - Database is shared by all users
     What makes passwd particularly tricky?
       - Easy to allow process to shed privileges (e.g., login)
       - passwd requires an escalation of privileges
     How does UNIX handle this?
       - Executable files can have their setuid bit set
       - If setuid bit is set, process inherits uid of image file's owner on exec

  40. MOO accounting problem
     Multi-player game called Moo
       - Want to maintain high score in a file
     How does setuid solve our problem?
       - Game executable is owned by trusted entity
       - Game cannot be modified by normal users
       - Users can run executable though
       - High-score is also owned by trusted entity
     This is a form of trustworthy computing
       - Only trusted code can update score
       - Root ownership ensures code integrity
       - Untrusted users can invoke trusted code
     [Diagram: Shell (uid x) fork/execs game as Game client (uid moo), x's score = 10; Shell (uid y) fork/execs game as Game client (uid moo), y's score = 11; High score (uid moo)]

  41. Summary of UNIX
     Share-by-copy is easier for programmers
       - Everything looks like a file
       - Standardize interface (open, read/write, close)
       - Standardize entry/exit points (stdin, stdout)
       - Read in copy, work on copy, copy out results
     Try to make share-by-copy more efficient
       - Use copy-on-write whenever possible
     Next time
       - Sharing across machines (RPC, code offload)
