Online Data Carpentry Workshop
University of Western Cape, sponsored by SADiLaR, South Africa
Instructors: Benson Muite, Oghenere Salubi, Kudakwashe Madzima, Sarah Schäfer
Helpers: Freddy Izingizwe, Ibrahim Ahmed
General Information
Data Carpentry develops and teaches workshops on the fundamental data skills needed to conduct research. Its target audience is researchers who have little to no prior computational experience, and its lessons are domain specific, building on learners' existing knowledge to enable them to quickly apply skills learned to their own research. Participants will be encouraged to help one another and to apply what they have learned to their own research problems.
For more information on what we teach and why, please see our paper 'Good Enough Practices for Scientific Computing'.
Who: The course is aimed at graduate students and other researchers. You don't need to have any previous knowledge of the tools that will be presented at the workshop.
Where: This is an online event. We will meet using the online videoconference software Zoom. You will need to download and install their client to connect with your instructors. The link to use for this event is https://carpentries.zoom.us/my/carpentriesroom3. If needed, the password is **202020**.
When: 12 April - 16 April, 2021. Add to your Google Calendar.
Requirements: Participants must be postgraduate students in the Humanities and Social Sciences. They must have access to a laptop with a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) on which they have administrative privileges, and they should have a few specific software packages installed (listed below). Since this is an online workshop, participants also need internet access. Some sponsorship is available for mobile data; please contact the organisers for more information.
Accessibility: We are committed to making this workshop accessible as far as possible. Please make sure that you have the following:
- Laptop with administrative privileges.
- Internet access for the duration of the workshop.
Materials will be provided in advance of the workshop, and you can ask the organisers beforehand for extra materials if needed. If we can help make learning easier for you, please get in touch.
Contact: Please email sschafer@uwc.ac.za for more information.
Code of Conduct
Everyone who participates in Carpentries activities is required to conform to the Code of Conduct. This document also outlines how to report an incident if needed.
Collaborative Notes
We will use this collaborative document for chatting, taking notes, and sharing URLs and bits of code.
Surveys
Please be sure to complete these surveys before and after the workshop.
Schedule
Day 1
08:30 | Pre-workshop survey |
09:00 | Data Organization in Spreadsheets |
12:00 | Data Cleaning with OpenRefine |
13:00 | END |
Day 2
08:30 | Data Cleaning with OpenRefine |
10:00 | Data Analysis and Visualisation with R |
13:00 | END |
Day 3
08:30 | Data Analysis and Visualisation with R |
13:00 | END |
Day 4
08:30 | Data Analysis and Visualisation with R |
13:00 | END |
Day 5
08:30 | Data Analysis and Visualisation with R |
13:00 | Post-workshop survey |
13:15 | END |
Syllabus
Data Organization in Spreadsheets
Data Cleaning with OpenRefine
Introduction to R
Setup
To participate in a Data Carpentry workshop, you will need access to the software described below. In addition, you will need an up-to-date web browser.
We maintain a list of common issues that occur during installation on the Configuration Problems and Solutions wiki page. It is written as a reference for instructors, but it may also be useful to participants.
The setup instructions for the Data Carpentry Social Sciences workshops (with R) can be found at the workshop overview site.
Additional R setup information is below.
1: R and RStudio
- R and RStudio are separate downloads and installations. R is the underlying statistical computing environment, but using R alone is no fun. RStudio is a graphical integrated development environment (IDE) that makes using R much easier and more interactive. You need to install R before you install RStudio. After installing both programs, you will need to install some specific R packages within RStudio. Follow the instructions below for your operating system, and then follow the instructions to install tidyverse.
For this workshop, we recommend R version 4.0 or later and RStudio 1.2 or later.
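The R version recommendation can be checked from the R console. The following is a minimal sketch using base R only; the variable names minimum_version and current_version are illustrative and not part of the workshop materials.

```r
# Minimum R version recommended in the setup notes for this workshop.
minimum_version <- "4.0.0"

# getRversion() returns the running R version as a comparable object.
current_version <- getRversion()

if (current_version >= minimum_version) {
  message("R ", as.character(current_version), " meets the recommendation.")
} else {
  message("R ", as.character(current_version), " is older than ",
          minimum_version, "; please install a newer release from CRAN.")
}
```

The RStudio version is not reported by this check; inside RStudio it is shown under “Help” > “About RStudio”.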
Windows
If you already have R and RStudio installed
- Open RStudio, and click on “Help” > “Check for updates”. If a new version is available, quit RStudio, and download the latest version for RStudio.
- To check which version of R you are using, start RStudio and the first thing that appears in the console indicates the version of R you are running. Alternatively, you can type sessionInfo(), which will also display which version of R you are running. Go on the CRAN website and check whether a more recent version is available. If so, please download and install it. You can check here for more information on how to remove old versions from your system if you wish to do so.
If you don’t have R and RStudio installed
- Download R from the CRAN website.
- Run the .exe file that was just downloaded
- Go to the RStudio download page
- Under Installers select RStudio x.yy.zzz - Windows Vista/7/8/10 (where x, y, and z represent version numbers)
- Double click the file to install it
- Once it’s installed, open RStudio to make sure it works and you don’t get any error messages.
macOS
If you already have R and RStudio installed
- Open RStudio, and click on “Help” > “Check for updates”. If a new version is available, quit RStudio, and download the latest version for RStudio.
- To check the version of R you are using, start RStudio and the first thing that appears in the console indicates the version of R you are running. Alternatively, you can type sessionInfo(), which will also display which version of R you are running. Go on the CRAN website and check whether a more recent version is available. If so, please download and install it. In any case, make sure you have at least R 4.0.
If you don’t have R and RStudio installed
- Download R from the CRAN website.
- Select the .pkg file for the latest R version
- Double click on the downloaded file to install R
- It is also a good idea to install XQuartz (needed by some packages)
- Go to the RStudio download page
- Under Installers select RStudio x.yy.zzz - Mac OS X 10.6+ (64-bit) (where x, y, and z represent version numbers)
- Double click the file to install RStudio
- Once it’s installed, open RStudio to make sure it works and you don’t get any error messages.
Linux
- Follow the instructions for your distribution from CRAN; they provide information to get the most recent version of R for common distributions. For most distributions, you could use your package manager (e.g., for Debian/Ubuntu run sudo apt-get install r-base, and for Fedora sudo yum install R), but we don’t recommend this approach as the versions provided this way are usually out of date. In any case, make sure you have at least R 4.0.
- Go to the RStudio download page
- Under Installers select the version that matches your distribution, and install it with your preferred method (e.g., with Debian/Ubuntu sudo dpkg -i rstudio-x.yy.zzz-amd64.deb at the terminal).
- Once it’s installed, open RStudio to make sure it works and you don’t get any error messages.
2: Install the Tidyverse
- After installing R and RStudio, you need to install the tidyverse packages. Start RStudio by double-clicking the icon and then type: install.packages('tidyverse'). You can also do this by going to Tools -> Install Packages and typing the names of the packages you want to install. You will see that the name auto-completes. Make sure the ‘include dependencies’ box is checked.
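If you are unsure whether the tidyverse is already present, the installation can be made conditional. The following is a minimal sketch; install_if_missing is a hypothetical helper written for illustration and is not part of R, RStudio, or the workshop materials.

```r
# Hypothetical helper (not part of the workshop materials): installs only
# the packages that cannot already be loaded, and returns their names.
install_if_missing <- function(packages) {
  # requireNamespace() returns TRUE when a package is installed and loadable.
  is_available <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
  missing_packages <- packages[!is_available]
  if (length(missing_packages) > 0) {
    # By default, install.packages() also installs required dependencies.
    install.packages(missing_packages)
  }
  invisible(missing_packages)
}
```

Typing install_if_missing("tidyverse") in the RStudio console installs the package only when it is absent, so it is safe to run more than once.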
3: Test your installation
- Download this test R script
- Open RStudio and use the File menu to open the script. (This script will load the Tidyverse and a package of data that is built into base R.)
- By default, the script pane will open on the upper left. Select all 9 lines of the script.
- Click the ‘Run the current line or selection’ button
- In the lower right pane, you should see a graph that looks like the illustration.
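If the test script cannot be downloaded, a similar check can be typed by hand. The sketch below is a stand-in and not the workshop's actual script: it assumes the tidyverse is already installed, and it uses the built-in mtcars dataset, chosen here purely for illustration.

```r
# Stand-in check, not the workshop's test script.
library(tidyverse)  # loads ggplot2, dplyr and the other core packages

# mtcars ships with R in the datasets package; any built-in data frame would do.
test_plot <- ggplot(data = mtcars, mapping = aes(x = wt, y = mpg)) +
  geom_point()

# Printing the plot object draws it in the Plots pane of RStudio.
print(test_plot)
```

If a scatter plot appears without error messages, both R and the tidyverse are working. The plot will not match the workshop illustration, since the data and plot type here are assumptions.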