Exploring Data Science at the Command Line with UNIX and Vim


An introduction to data science at the command line with UNIX and Vim. The slides cover why the command line matters (it is agile, close to the filesystem, scalable, and extensible), how it integrates with other technologies, and the role it plays in supercomputing and remote computing. Topics include command line basics, navigation, quick data access, file creation and editing, and Vim modes.


Uploaded on Oct 08, 2024



Presentation Transcript


  1. Data Science at the Command Line: Unix, Vim, and Supercomputing. Paul Bodily, Computer Science

  2. Why Command Line? Agile; REPL; close to the filesystem; integrates well with other technologies; scalable and repeatable; extensible; ubiquitous (Linux, Mac OS X; supercomputers, servers, laptops); Linux skills are in high demand

  3. The REAL Why Command Line: Games: tetris, pong, snake, solitaire, gomoku (Connect 4), 5x5, telnet towel.blinkenlights.nl, dunnet (text-based adventure game), landmark, doctor

  4. Vim vs Emacs

  5. Supercomputing: data science; big data; remote computing; typically Linux; in high demand, good jobs

  6. Command Line Basics

  7. Where is the Command Line? It runs like a program; Terminal; iTerm; Windows; VM; install Linux like Windows software

  8. Command Line Navigation: pwd; root vs. home dir; ls; cd; mkdir; mv; rm (-r) (PERMANENT)
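
A minimal sketch of the navigation commands on this slide. It assumes a scratch directory created with mktemp, and the directory and file names (project, data, notes.txt) are made up for illustration:

```shell
# Work in a throwaway directory (an assumption of this sketch) so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir"
pwd                      # print the directory you are currently in

mkdir -p project/data    # make a directory and a subdirectory inside it
cd project               # a relative path: move into project
ls                       # shows: data

touch notes.txt          # create an empty file
mv notes.txt data/       # move the file into a directory
mv data datasets         # mv also renames; directories need no -r flag
ls datasets              # shows: notes.txt

cd /                     # / is the root of the filesystem
echo ~                   # ~ expands to your home directory
cd "$workdir"
rm -r project            # -r removes a directory and its contents; there is no undo
```

Because rm does not use a trash folder, it is worth running ls on the same path first to confirm what is about to be deleted.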

  9. Command Line Navigation: history; Ctrl-R; Ctrl-A, Ctrl-E, Ctrl-K; echo; !!, ?!, *; $? (exit status of previous command)
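
The history shortcuts (history, Ctrl-R, !!) only make sense at an interactive prompt, but the * wildcard and $? can be sketched in a script. The file names below are hypothetical:

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch a.csv b.csv notes.txt

echo *.csv                   # the shell expands * before echo runs: a.csv b.csv

ls a.csv > /dev/null
ok_status=$?                 # 0 means the previous command succeeded

fail_status=0
ls missing.csv 2> /dev/null || fail_status=$?   # non-zero means it failed

echo "existing file: $ok_status, missing file: $fail_status"
```

The exact non-zero value for a failed ls differs between systems, so scripts usually test only for zero versus non-zero.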

  10. Command Line Quick Data Access: cat, head, tail, less, wc, Ctrl-C
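
A short sketch of peeking at a file without opening an editor. The file numbers.txt is a made-up sample generated with seq, which is assumed to be available (it is on typical Linux and macOS systems):

```shell
workdir=$(mktemp -d)
cd "$workdir"
seq 1 100 > numbers.txt          # sample data: the numbers 1 to 100, one per line

head -n 3 numbers.txt            # first three lines: 1, 2, 3
tail -n 2 numbers.txt            # last two lines: 99, 100
line_count=$(wc -l < numbers.txt)
echo "numbers.txt has $line_count lines"

first_line=$(head -n 1 numbers.txt)
last_line=$(tail -n 1 numbers.txt)
```

cat numbers.txt would print the whole file at once, while less numbers.txt opens an interactive pager (press q to quit). Ctrl-C interrupts a command that is printing more than expected.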

  11. File Creation, Editing, and Security

  12. File Creation and Editing: touch, vi, emacs, nano

  13. Vim Modes: Normal/command mode (Esc), the default: write, quit, search, copy, paste, fast navigation. Insert mode (i). Visual mode (v): easy text selection.

  14. Vim Navigation: https://www.maketecheasier.com/vim-keyboard-shortcuts-cheatsheet/ G; gg; 25gg (go to line 25); writing; quitting; yy, p (copy, paste); Shift-P; u; 100yy, 2p; dd (delete line); Shift-A, Shift-I; search; record macros; syntax highlighting; search/replace; split window

  15. File Security: chmod [ugoa][+-=][rwx] filename(s); recursive option; affects visibility of files on a webserver
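
The chmod pattern on this slide reads as who (u, g, o, a), then an operator (+, -, =), then the permissions (r, w, x). A minimal sketch, assuming a scratch directory and hypothetical file names:

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch report.txt

chmod u=rw,go=r report.txt   # owner: read and write; group and others: read only
chmod go-r report.txt        # take read access away from group and others
chmod u+x report.txt         # let the owner execute the file

perms=$(ls -l report.txt | cut -c1-10)
echo "$perms"                # -rwx------

mkdir -p site/img
chmod -R go-w site           # -R applies the change to a directory and everything in it
```

On a web server, files typically need to be readable by the account the server runs as, which is why removing read access for others can make a page disappear.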

  16. Command Line: Beyond the Basics

  17. Command Line Data Direction/Manipulation: pipe, redirects; diff; cut; sort; uniq; join; grep; awk (field processing); python

  18. Basketball Example: Idaho State University vs University of Idaho. Keep a record of all points scored: name, team, number, points, quarter. Q: Print a roster. Q: How many 2-pointers did Court score? Q: Change Ferdi to Ferdy? Q: Which players scored a 3-pointer? Q: How many total points did ISU score on 3-pointers?
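
One way the questions on this slide could be answered with standard tools. The slides do not include the data file, so everything here is an assumption: a comma-separated file named points.csv with the columns name, team, number, points, quarter, and invented rows (Avery and Blake are made-up players). sed is not on the tools slide, but it is a common choice for a simple substitution:

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Invented sample data: name,team,number,points,quarter
cat > points.csv <<'EOF'
Court,ISU,12,2,1
Ferdi,UI,5,3,1
Court,ISU,12,3,2
Avery,ISU,23,3,2
Court,ISU,12,2,3
Ferdi,UI,5,2,4
Blake,UI,9,1,4
EOF

# Q: Print a roster (each name, team, number once)
cut -d, -f1-3 points.csv | sort | uniq > roster.txt
cat roster.txt

# Q: How many 2-pointers did Court score?
court_twos=$(awk -F, '$1 == "Court" && $4 == 2' points.csv | wc -l)
echo "Court 2-pointers: $court_twos"

# Q: Change Ferdi to Ferdy (write a corrected copy, keep the original)
sed 's/^Ferdi,/Ferdy,/' points.csv > points_fixed.csv

# Q: Which players scored a 3-pointer?
awk -F, '$4 == 3 { print $1 }' points.csv | sort | uniq

# Q: How many total points did ISU score on 3-pointers?
isu_three_points=$(awk -F, '$2 == "ISU" && $4 == 3 { total += $4 } END { print total + 0 }' points.csv)
echo "ISU points from 3-pointers: $isu_three_points"
```

Each answer is a short pipeline of existing tools, with awk doing the field tests and sort and uniq removing duplicates.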

  19. Take-home: Use existing tools, don't write new code

  20. Command Line Tools/Programs: man, PATH, which, .bashrc
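
A small sketch of how PATH decides which program runs. The tool name mytool and its directory are hypothetical, and command -v is used here as the shell built-in counterpart of which:

```shell
workdir=$(mktemp -d)
mkdir "$workdir/bin"

# Create a tiny hypothetical program
printf '#!/bin/sh\necho "hello from mytool"\n' > "$workdir/bin/mytool"
chmod +x "$workdir/bin/mytool"

# The shell searches the directories in PATH, in order, for a matching program
PATH="$workdir/bin:$PATH"

tool_path=$(command -v mytool)   # like "which mytool": where the shell found it
echo "$tool_path"
mytool                           # prints: hello from mytool
```

A change to PATH made this way lasts only for the current shell. Putting an export PATH=... line in ~/.bashrc makes it apply to every new bash session, and man followed by a command name (for example man ls) shows that command's manual.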

  21. Bash Scripting: A collection of command line commands that are executed sequentially. Has extension .sh. Has variables, loops, conditionals. Executed by running bash <filename.sh> OR by adding the path to the interpreter (e.g. #!/bin/bash) as the first line of the file and running the file directly.
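
A minimal sketch of a script that uses a variable, a loop, and a conditional. The file names and the threshold are made up for illustration:

```shell
#!/usr/bin/env bash
# Report which text files are longer than a threshold.

threshold=3                                   # a variable
workdir=$(mktemp -d)
printf 'a\nb\n' > "$workdir/short.txt"        # 2 lines
printf 'a\nb\nc\nd\n' > "$workdir/long.txt"   # 4 lines

long_files=0
for file in "$workdir"/*.txt; do              # a loop over matching files
    lines=$(wc -l < "$file")
    if [ "$lines" -gt "$threshold" ]; then    # a conditional
        echo "$(basename "$file") is long"
        long_files=$((long_files + 1))
    else
        echo "$(basename "$file") is short"
    fi
done

echo "$long_files file(s) have more than $threshold lines"
```

Saved under a hypothetical name such as summarize.sh, this runs with bash summarize.sh. Because of the first line, it also runs as ./summarize.sh once it has been made executable with chmod +x summarize.sh.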

  22. Remote Computing: ssh, config file, scp, wget, exit

  23. Supercomputing Interactive vs non-interactive nodes

  24. Supercomputers at ISU: Thorshammer (research): 8 nodes, 144 physical cores (288 with hyperthreading), 768 GB RAM, ~80 TB storage. Minerve: 9 nodes, 72 cores (144 with hyperthreading), 288 GB RAM.

  25. Supercomputers at ISU: https://help.cose.isu.edu/how-to/hpcc Request an account; SSH guide; Torque Job Scheduler; current software; contact information

  26. Job Submission Script: notification preferences; contact email; resources request; output/error destination; runs mostly like a regular bash script
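
A sketch of what such a script might look like for the Torque scheduler. The slides do not show ISU's actual template, so the job name, email address, resource amounts, and file names are all placeholders. The #PBS lines are ordinary comments to bash and are read only by the scheduler:

```shell
#!/bin/bash
#PBS -N demo_job               # job name (placeholder)
#PBS -M user@example.edu       # contact email (placeholder)
#PBS -m abe                    # notify on abort, begin, and end
#PBS -l nodes=1:ppn=4          # resources request: 1 node, 4 processors per node
#PBS -l walltime=00:10:00      # resources request: at most 10 minutes
#PBS -o demo_job.out           # standard output destination
#PBS -e demo_job.err           # standard error destination

# Torque sets PBS_O_WORKDIR to the directory the job was submitted from.
# The fallback to the current directory lets the script also run as plain bash.
cd "${PBS_O_WORKDIR:-$PWD}"

job_host=$(uname -n)
echo "Running on $job_host"
```

With Torque available, a script like this would be submitted with qsub followed by the script's file name, and qstat would show its place in the queue.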

  27. Torque Job Scheduling: qsub, qstat, qdel, watch, man

  28. Miscellaneous

  29. Unix Package Installers: Homebrew, MacPorts, apt-get, yum, Fink, pip
