Effective Data Organization Strategies for Research Projects


Explore key considerations for organizing data in research projects, including storage options, file structures, access needs, and data size implications. Learn about byte values, storage capacities, and practical tips shared by Jim Graham from Humboldt State University.


Uploaded on Sep 20, 2024



Presentation Transcript


  1. GSP 510: Organizing Data
     Jim Graham, Humboldt State University

  2. Organization Options
     Questions:
       - How much?
       - What is the time frame?
       - Who will need access?
       - How complex?
     Options:
       - File storage: local or remote
       - Relational database: sometimes required for complex data

  3. Definitions
       - A byte is one value from 0 to 255
       - A double floating-point value takes 8 bytes
       - 1 letter of text is 1 to 4 bytes (based on language)
       - 1 Megabyte is about 1 million bytes
       - 1 Gigabyte is about 1 billion bytes
       - 1 Terabyte is about 1 trillion bytes
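
The definitions above are enough to estimate how large a dataset will be before it is created. The following is a minimal sketch, assuming decimal units and an uncompressed raster stored as 8-byte doubles; the helper names `raster_size_bytes` and `human_readable` are hypothetical and not part of the course material.

```python
# Decimal units, matching "about 1 million / billion / trillion bytes".
MEGABYTE = 10**6
GIGABYTE = 10**9
TERABYTE = 10**12

BYTES_PER_DOUBLE = 8  # a double floating-point value takes 8 bytes


def raster_size_bytes(rows: int, cols: int, bands: int = 1) -> int:
    """Uncompressed size of a raster whose cells are 8-byte doubles."""
    return rows * cols * bands * BYTES_PER_DOUBLE


def human_readable(n_bytes: int) -> str:
    """Format a byte count using the largest unit that fits."""
    for unit, size in (("TB", TERABYTE), ("GB", GIGABYTE), ("MB", MEGABYTE)):
        if n_bytes >= size:
            return f"{n_bytes / size:.1f} {unit}"
    return f"{n_bytes} bytes"


# A 10,000 x 10,000 single-band raster of doubles.
print(human_readable(raster_size_bytes(10_000, 10_000)))  # 800.0 MB

# Text size depends on the characters used (UTF-8 encoding assumed here).
print(len("a".encode("utf-8")))  # 1
```

Real files are often smaller than this estimate because formats such as GeoTIFF can compress the data.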

  4. How much?
       - Moving large datasets takes time
       - It is faster to physically mail a hard drive with a 10-terabyte dataset than to send it over the Internet!
       - Eel (Wiyot) River LiDAR data with processing is over 10 Terabytes
       - A few shapefiles can be less than a megabyte
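
The mailing comparison above comes down to simple arithmetic. The sketch below assumes a sustained 100 megabit-per-second connection, which is an illustrative figure and not one given in the slides; the function name `transfer_days` is also hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60
TERABYTE = 10**12  # about 1 trillion bytes


def transfer_days(size_bytes: int, megabits_per_second: float) -> float:
    """Days needed to move size_bytes at a sustained link speed."""
    bits = size_bytes * 8                        # 8 bits per byte
    seconds = bits / (megabits_per_second * 10**6)
    return seconds / SECONDS_PER_DAY


# 10 TB over an assumed 100 Mbit/s link: a little over 9 days.
print(f"{transfer_days(10 * TERABYTE, 100):.1f} days")
```

Actual throughput is usually lower than the rated link speed, so the real transfer would likely take longer.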

  5. Where is access needed?
       - Local access, internal drive is the fastest
           - Used to be expensive
       - USB is a little slower
       - Network drive can be USB speed or faster
           - Definitely expensive
       - Cloud is definitely slower
       - Cost?
           - Cost is a moving target

  6. At CPH
       - Best to have a local computer for the speed
       - Google Drive is good for backups and transferring data
           - Has corrupted files
       - External drives are also good for backups and transferring
       - Network drives are faster but only available within CPH's network

  7. Hierarchical Storage
       - Internal Drive
       - External Drive
       - Network Drive
       - Internet ("The Cloud")

  8. Folder Structure Graham Lab Papers General Data Projects 2023 Stream Depletion Study Eureka Art Map Mono Lake Papers and reports (we are writing) Data 2023 Proposals and Contracts

  9. Tips
       - With ArcGIS, we typically create a new dataset on each tool/transform.
       - Each time we re-execute a tool/transform, we get another set of files.
       - With shapefiles being 4-10 files each, this creates a huge number of files very quickly.
       - Give files good names:
           - CPH_3inch.tif
           - CPH_3inch_cropped.tif
           - CPH_3inch_cropped_UTM.tif
       - Number duplicates: Crop1.tif, Crop2.tif
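
The naming tips above can be automated so that each processing step produces a predictable file name. This is a minimal sketch using only the Python standard library; `derived_name` and `next_numbered` are hypothetical helpers, not ArcGIS functions.

```python
from pathlib import Path


def derived_name(source: str, step: str) -> str:
    """Append a processing step to a file name, keeping the extension."""
    p = Path(source)
    return f"{p.stem}_{step}{p.suffix}"


def next_numbered(folder: Path, base: str, suffix: str) -> Path:
    """Return the first unused path of the form base1, base2, ... in folder."""
    n = 1
    while (folder / f"{base}{n}{suffix}").exists():
        n += 1
    return folder / f"{base}{n}{suffix}"


cropped = derived_name("CPH_3inch.tif", "cropped")
print(cropped)                       # CPH_3inch_cropped.tif
print(derived_name(cropped, "UTM"))  # CPH_3inch_cropped_UTM.tif
```

Calling `next_numbered(folder, "Crop", ".tif")` in a folder that already holds `Crop1.tif` returns the path for `Crop2.tif`, so a re-run does not overwrite the earlier result.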

  10. Local Organization
       - 1_Originals: Initial acquired data
           - Typically, not shared or backed up
       - 2_Working: Local working folder
           - Typically, not shared or backed up
       - 3_Final: Final datasets and maps
           - Definitely shared and backed up
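
The three-stage layout above is easy to create consistently for every new project. The following sketch assumes the folder names from the slide; the function name `create_project_folders` and the example project path are hypothetical.

```python
from pathlib import Path

# Folder names taken from the slide; the numeric prefixes keep them sorted.
STAGE_FOLDERS = ("1_Originals", "2_Working", "3_Final")


def create_project_folders(root: Path) -> list[Path]:
    """Create the stage folders under root and return their paths.

    Safe to call again: folders that already exist are left untouched.
    """
    created = []
    for name in STAGE_FOLDERS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

For example, `create_project_folders(Path("Projects") / "Mono Lake")` would set up the three folders for one project.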

  11. Tips
       - Don't throw anything away
           - Move it to an _old folder
           - Zip it and back it up to Google Drive
       - Add ReadMe.txt files to explain contents of folders
       - Back everything up to two physical locations
       - Avoid viruses
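
The "don't throw anything away" workflow above can be sketched with the Python standard library. The helper names `retire` and `zip_folder` are hypothetical, and the sketch does not handle an item with the same name already being present in `_old`; uploading the resulting zip file to a backup location is left as a manual step.

```python
import shutil
from pathlib import Path


def retire(item: Path, note: str) -> Path:
    """Move item into a sibling _old folder instead of deleting it."""
    old_dir = item.parent / "_old"
    old_dir.mkdir(exist_ok=True)
    target = old_dir / item.name
    shutil.move(str(item), str(target))
    # Record why the item was retired so the folder explains itself.
    with open(old_dir / "ReadMe.txt", "a", encoding="utf-8") as readme:
        readme.write(f"{item.name}: {note}\n")
    return target


def zip_folder(folder: Path) -> Path:
    """Create folder.zip next to folder and return the archive path."""
    archive = shutil.make_archive(str(folder), "zip", root_dir=folder)
    return Path(archive)
```

A typical use would be `retire(Path("2_Working/Crop1.tif"), "superseded by Crop2")` followed by `zip_folder(Path("2_Working/_old"))`.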

  12. Geospatial Data Organization
     How to break down?
       - By theme:
           - Infrastructure
           - Plants
               - Grasses
               - Trees
           - Animals
       - By time frame: 2020, 2021, 2022, 2023?
       - By area: Lee Vining Creek, Bull Creek, etc.
       - Other?

  13. Viruses
       - Stealing information: Rare but could be costly
       - Corrupting your computer: Rare; worst case is reformatting and reinstalling from backups
       - Stolen equipment: Happens, back up to the cloud
       - Hardware failures and losses: Happens
       - Ransomware: On the rise, expensive
       - Adware: Annoying but not a disaster
       - Phishing scams: Common

  14. Avoiding Issues
       - When on the web, make sure the domain name is for the company you expect. Call if unsure.
       - Don't send passwords through email; call instead
       - Don't open suspect emails

  15. Avoid Losing Data
       - Good organization and some documentation
       - When working as a group, have a place to exchange data quickly and a more formal shared drive for long-term access and storage
       - Have a written protocol for adding and removing data
       - Revisit shared drives regularly and review their contents
       - Be careful installing applications and opening emails!
       - And protect your passwords!
