Feedback from Instructors on pilot workshops #31

Open
ErinBecker opened this issue Sep 24, 2018 · 7 comments

@ErinBecker
Contributor

This issue is meant to collect feedback from Instructors (and helpers, etc) running pilot workshops. Leave open through December 2018.

Instructors, please add any comments you have about how your pilot workshop went. In particular:

  • What type of an audience did you have? What was their background and skill level?
  • Which (if any) of the two intro lessons did you teach (Intro to Geospatial Concepts, Intro to R for Geospatial Data)? If you taught both, which order did you teach them in?
  • How long did it take to teach each lesson? Were there any parts you had to leave out? Please be specific about what you left out.
  • If you left out material, how did it seem to affect your learners? Was it ok to remove that material? Did you have to come back to it later? Was it confusing to skip certain concepts?
  • What problems did your learners have with the installation? What solutions did you find? If you are comfortable doing so, please also put this information into the Instructor Notes.
  • Do you have any specific tips for other Instructors teaching these lessons? If you are comfortable doing so, please also put this information into the Instructor Notes.

You don't need to answer all of the questions above! Please share any and all information that you think will be helpful for future Instructors of these lessons. I will be reading through this issue before the Curriculum Advisory Committee meeting in November and will raise issues to that committee for discussion. Please also feel free to leave specific issues on the individual lessons. The Maintainers and Curriculum Advisors really appreciate your feedback!

@ErinBecker ErinBecker added the type:discussion Discussion or feedback about the lesson label Sep 24, 2018
@ErinBecker
Contributor Author

These notes are from Michael Culshaw-Maurer from a pilot workshop in August. Some of the curriculum rearrangements might have already addressed some of these comments, but I will include the full set here for reference.

Thoughts After Teaching 1st DC Geospatial Workshop

Intro to R

Overall Thoughts

  • I had to introduce tons of stuff on the fly, mostly due to ordering of chapters
  • Order needs work: data.frames are introduced before vectors, and a lot of prerequisites come too late
  • I found myself saying the phrases "we'll cover this more in a little bit" and "alright, we're gonna look at this again" far too often
  • The very first code actually written in R is creating a data.frame from scratch, then writing it to a .csv, then subsetting it with "$". All of these things are left unexplained, but are fully covered later on. You have to either leave people confused, or leave them with a lot of redundant info on data.frames, subsetting, and writing data later on.
  • The chapter is too long in general, and it seemed to overwhelm many people
  • It needs to be decided whether this chapter is more in service of the geospatial chapter or should serve as a standalone introduction to R
    • In that sense, do we want people to come out of this workshop with general ability in R (including non-spatial data) or get them working with spatial data as quickly as possible?
  • There are references to Nano and the command line, neither of which needs to be used; they simply caused confusion

Fine-Scale Points

Intro to R/RStudio

  • Never introduces comments
  • Never uses the word object
  • Order of operations seemed like a bit too much detail

Data Structures

  • data.frame() not explained
  • c() not explained
  • write.csv() not explained
  • creating a fake data.frame frustrated lots of people and didn't seem to aid understanding
    • I think this could be different if building up from simple vectors first, which is a general issue I have with the chapter ordering
  • characters vs. numbers not explained
  • functions not explained
  • paste() not explained and somewhat pointless
  • the "issue with factors" is brought up, but then the phrase "understanding what happened here is key to successfully analyzing data in R" is not followed by anything that will give that understanding
  • typeof() and class() should be introduced together and compared
  • "A user has added new details" to a file that does not exist, then the "cats" data is mentioned, though it never appears in this chapter
  • then creates something called "nordic_orig" that's never referenced again
  • It's not really clear why it's worth saying that the vector function defaults to "logical", as this gives the idea that "logical" is somehow the core kind of vector. Start off with numeric or character vectors.
  • Also probably not worth starting with empty vectors with a preexisting mode, start with actual vectors with actual things in them
  • This would be a good place to introduce the c() function
  • cats data get mentioned again
  • coercion is good but maybe doesn't need as much of a treatment as it's getting
  • now we introduce data.frames formally, but there's been a lot of confusion up to this point
  • finally we get to factors, which are introduced well, but too late. Should go along with vectors
  • lists are introduced well, but their relationship to data.frames is not quite clear
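
To make the typeof()/class() comparison and the point about vector() defaulting to "logical" concrete, here is a minimal base R sketch. The object names (x_num, x_fct, and so on) are made up for illustration and are not taken from the lesson.

```r
# Hypothetical example vectors; none of these names come from the lesson.
x_num <- c(1.5, 2, 3)
x_fct <- factor(c("low", "high", "low"))

# typeof() reports the internal storage type;
# class() reports what kind of object R treats it as.
typeof(x_num)  # "double"
class(x_num)   # "numeric"
typeof(x_fct)  # "integer" (factors are stored as integer codes)
class(x_fct)   # "factor"

# vector() with no mode argument returns a logical vector,
# which is why starting from a filled numeric or character vector reads more naturally.
empty <- vector(length = 2)
typeof(empty)  # "logical"

# Coercion: mixing types in c() silently converts everything to the most general type.
mixed <- c(1, "a", TRUE)
typeof(mixed)  # "character"
```

Showing the two functions side by side on a factor is one way to motivate why both exist, since they disagree there.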

Exploring Data Frames

  • I really think data.frames should first be introduced here, as an extension of vectors and lists
  • Now explain how data.frames are lists, how each row is a list and each column is a vector. Look at different vector types as part of this data.frame
  • This chapter as a whole is pretty solid, but the ordering of stuff before it really throws it off
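
As a rough illustration of the "data.frames are lists" framing suggested above, the following base R sketch builds a small data.frame and inspects it. The `sites` data and its column names are invented for this example. One caveat worth stating to learners: a single row pulled out with `[1, ]` is still a one-row data.frame (which is itself a list), not a bare list.

```r
# Hypothetical data; the name `sites` and its columns are assumptions for illustration.
sites <- data.frame(
  site = c("A", "B", "C"),
  elevation = c(120, 340, 280),
  stringsAsFactors = FALSE
)

# A data.frame is a list with one element per column.
typeof(sites)   # "list"
is.list(sites)  # TRUE
length(sites)   # 2

# Each column is an atomic vector with a single type.
is.character(sites$site)     # TRUE
is.numeric(sites$elevation)  # TRUE

# A row can mix types, so it stays list-like: here it is a one-row data.frame.
row1 <- sites[1, ]
class(row1)     # "data.frame"
as.list(row1)   # a plain list with elements `site` and `elevation`
```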

Subsetting Data

  • Should be broken up and combined with the previous two chapters
    • There's no reason subsetting should be its own concept, it is fundamentally part of learning about vectors and data.frames
    • We already did subsetting in prior chapters
  • Atomic vectors are just introduced here

Creating Publication-Quality Graphics

  • Good ggplot chapter, but since there's a LOT of base R plotting in the Geospatial portion, I think ggplot feels like a bit of an afterthought
    • See "General Thoughts" for more, but I think the use of tidyverse needs to be reconsidered, should be all-or-nothing

Writing Data

  • Already used write.csv() in Data Structures
  • Don't like the use of the shell in this chapter

Dataframe Manipulation with dplyr

  • Good dplyr chapter, but it's a LOT to get through and doesn't get used at all in Geospatial
    • Again, see "General Thoughts" on tidyverse

Writing Good Software

  • Talks about writing functions, but writing functions is never taught in this whole lesson

Proposed Order of Concepts for Intro to R

  1. R as calculator (plus comments)
  2. Object assignment
  3. Basic vectors (define atomic)
  4. Vector creation with c()
  5. Functions and arguments
  6. Data types and coercion
  7. typeof() and class()
  8. Subsetting vectors
  9. Lists
  10. download.file() and read.csv()
  11. data.frame intro
  12. Rows = lists, columns = vectors
  13. str(), dim(), etc.
  14. Subsetting data.frames
  15. Adding/removing
  16. write.csv()

From here, either a comprehensive tidyverse treatment, or jump to "Writing Good Software". I think you could get through this "Intro R" lesson pretty quickly and even get into some of the first Geospatial chapters on Day 1. I think our Day 1 structure had little to no geospatial payoff, which seemed to leave some people frustrated.
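
For a sense of how compact the later steps of the proposed order could be, here is a sketch in base R covering function calls, vector subsetting, data.frame subsetting, adding/removing a column, and write.csv(). All names and values (`heights`, `plots`, `out_file`) are hypothetical, and the file is written to a temporary location rather than a lesson data directory.

```r
# Steps 2-5: assign a vector built with c(), then call a function on it.
heights <- c(1.2, 3.4, 2.8)  # made-up values
mean_height <- mean(heights)

# Step 8: subset a vector with a logical condition.
tall <- heights[heights > 2]

# Steps 11 and 14: build a data.frame from vectors, then subset its rows.
plots <- data.frame(plot_id = c("A", "B", "C"), height = heights)
tall_plots <- plots[plots$height > 2, ]

# Step 15: add a column, then remove one.
plots$height_cm <- plots$height * 100
plots$height <- NULL

# Step 16: write to disk and read back (temporary file, assumed for the example).
out_file <- tempfile(fileext = ".csv")
write.csv(plots, out_file, row.names = FALSE)
round_trip <- read.csv(out_file)
```

Because each step reuses the object from the previous one, nothing has to be introduced "on the fly" before it is explained.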

General Thoughts

  • For many folks, the things that stuck the most were "tips and tricks" in R and RStudio. I had people come up and tell me that they'd never used tab-complete before, or they'd never used the "Section Header" functionality in .R files in RStudio
  • Day 1 had very little payoff in terms of geospatial data, and this seemed to frustrate folks. Many of the Carpentries workshops get realistic data into the participant's hands ASAP, but this workshop delays any geospatial data until Day 2.
  • The use of the tidyverse is great, but feels like an afterthought
    • The tidyverse paradigm is wonderful, but it takes some time to explain and wrap your head around. Since it's not used much in the Geospatial sections, I really think some time and energy could be saved by leaving it out
    • If tidyverse is to be included, I think it needs a fuller treatment
    • I think it would be fine to say "while we're not going to be using it much for this Geospatial workshop, there's this series of packages called the tidyverse, and it can be used for everything from data cleaning and manipulation to plotting. Here are some resources to learn more about it _____." As of now, it feels like a conceptual overload
  • The workshop seems to have a bit of an identity crisis. Do we want participants to come out with a solid working knowledge of R in general, and some very basic ideas of working with geospatial data in R? Or do we want students to come out with enough basic R knowledge to support a more in-depth understanding of how to work with geospatial data in R? Most of the Intro R knowledge isn't used in the Geospatial section, so it seems like it could be pared down in service of more time on the Geospatial stuff.
    • This is a super ambitious workshop: most other R workshops cover way less ground, since they don't have to teach about a fundamental data concept. Tabular data is the default for many fields, and that means that realistic data fits into the basic R lessons easily. However, geospatial data is different, so the workshop has to introduce what geospatial data is, in addition to covering basic tabular data in R
    • I really think it can be done, but it requires streamlining the Intro R stuff. You just have to tell people that we're not going into lots of R's more typical capabilities

@chris-prener

These are great notes. FWIW I start GIS students in R without covering much of this at all - jump right to making maps with an sf object and then back fill much of these fundamental aspects as we progress through the semester. I think Michael's suggestion about either (a) speeding up our coverage of R fundamentals or (b) starting with spatial data is something we should take really seriously.

"Tabular data is the default for many fields, and that means that realistic data fits into the basic R lessons easily. However, geospatial data is different, so the workshop has to introduce what geospatial data is, in addition to covering basic tabular data in R"

The beauty of sf objects is that the things that we would think of as "tabular" data operations also work on sf objects - I see less of a difference here than I may have in the past. Just my two cents.
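
A small sketch of that point, assuming the sf package is installed. The point data below (`pts`, its columns, and the coordinates) are invented for illustration; only `st_as_sf()` and ordinary data.frame syntax are used.

```r
library(sf)

# Hypothetical sampling points; names and values are assumptions for this example.
pts <- data.frame(
  site = c("a", "b", "c"),
  elev = c(10, 250, 400),
  lon  = c(-72.2, -72.1, -72.0),
  lat  = c(42.5, 42.6, 42.7)
)

# Convert to an sf object; the coordinate columns become a geometry column.
pts_sf <- st_as_sf(pts, coords = c("lon", "lat"), crs = 4326)

# An sf object is still a data.frame, so the usual tabular operations apply.
inherits(pts_sf, "data.frame")  # TRUE

# Row subsetting keeps the geometry attached.
high <- pts_sf[pts_sf$elev > 100, ]

# Adding a column works exactly as on a plain data.frame.
pts_sf$elev_km <- pts_sf$elev / 1000
```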

@MCMaurer

@chris-prener thanks Chris, I'm glad you found the notes useful! I figured I'd tag myself in here just in case anyone wanted to ask for any clarification on these notes.

@jebyrnes

For the next few comments, here is some info from the Geospatial Data Carpentry Class we taught at UMass Boston from January 22-23, 2019. https://jebyrnes.github.io/2019-01-22-UMB/

FYI, the code we generated is in https://github.com/jebyrnes/r-spatial_dc_workshop_2019-01-22 and if you want to check out our Etherpad, it's at https://pad.carpentries.org/2019-01-22-umb-geospatial

To start with, to answer @ErinBecker's questions

  • What type of an audience did you have? What was their background and skill level?
    We had a mix of grad students, a few postdocs and professors, and agency scientists. We also had a few undergrads. People's prior expertise in both R and GIS ranged from none to advanced, although very few were in the latter category. People were more experienced with GIS than with R.

  • Which (if any) of the two intro lessons did you teach (Intro to Geospatial Concepts, Intro to R for Geospatial Data)? If you taught both, which order did you teach them in?
    We taught both, with the intro to geospatial first (we just followed the order). On reflection, we'd reverse that order - but see also our notes below.

  • How long did it take to teach each lesson? Were there any parts you had to leave out? Please be specific about what you left out.
    It varied. Generally, lessons took ~ double the time listed. We only made it to plotting multiple shapefiles + rasters, and left out multi band rasters. We also ditched a lot of the plot tidying up along the way to cover more conceptual material.

  • If you left out material, how did it seem to affect your learners? Was it ok to remove that material? Did you have to come back to it later? Was it confusing to skip certain concepts?
    It was fine. A few of our learners stayed late after day 2 to go over the time series of rasters lesson, as they had some immediate applications.

  • What problems did your learners have with the installation? What solutions did you find?
    Surprisingly few. There were a few hiccups that our helpers helped to fix, but they were idiosyncratic.

  • Do you have any specific tips for other Instructors teaching these lessons?
    See comment below. In general, there's a LOT of time spent on making plots very pretty. That can be dropped, or covered only once. The intro to R is very detailed in terms of how to manipulate data frames in base R before introducing dplyr. Given that the rest of the lesson uses dplyr, you can probably just drop that. But, see additional comments below. Ditto on factors: the ratio of time spent to actual use is too skewed currently. Teaching episodes 1, 2, 3 (with reduced coverage of factors and lists and shortened type coercion, since they aren't used in many of the remaining lessons), 6, and 7 might not be bad in the current lesson.

@jebyrnes

Notes on Geospatial Lesson from Workshop at UMB

2019-01-22 and 23

From @jebyrnes, @DrK-Lo, and @KlausVigo

General:

  • Can the R general stuff be integrated more into the geospatial teaching? So, get through the basics of how to use R, and then move into rasters. Introduce ggplot, dplyr, subsetting, etc. as a part of working with rasters. Or sf objects instead of data frames, and then end with rasters? Basically, we thought tighter integration would make this an overall better workshop, and our learners agreed. So, intro to R would only really go through lesson 3, but in lesson 3, we would include loading both sf and raster objects as two data types and reduce info on factors, lists, and type coercion, which aren't really germane later.

  • Can we move the intro to geospatial to after the intro to R, if they are both going to remain.

  • Each geospatial lesson takes 1 hour, not 30 minutes. If we get through lesson 10, we are lucky.

  • Deciding whether this is a workshop for R or for geospatial, that happens to use R, could clarify things.

Intro to Geospatial

  • Slides should be provided - easy with RMarkdown. Maybe move code chunks into .R files, and use those chunks to generate imagery for slides? Would promote tighter integration.

Intro to R

  • To what extent are all of the rbind, cbind, etc., stuff in Exploring Data Frames necessary?
  • The R piece is too long - is it all needed? Should we just do dplyr for subsetting/manipulations, and use that and tidyr throughout? Might be worth having a few people sit down and really think about what is versus is not needed to get to the geospatial bit
  • Way too much time spent on data frames in detail rather than moving on to rasters.
  • Dplyr section too long? Do they need to know more than filter? And I say this loving dplyr.

Raster Lessons

  • Why use capture.output() when you can save the output of GDALinfo() into an object if you need it?
  • Intro lesson took 1 hour
  • Cut out multiband bit from lesson 1. Confusing. We dropped it, and moved the material to the multiband lesson
  • The Find Bad Values bit in lesson 1 seems orphaned, and can be cut
  • Challenge at end of the plot raster data lesson has so many parts. It's enormous.
  • Cut publication quality graphics episode unless it's folded into a more general ggplot episode?
  • Make the ggplot code in the lessons and exercises less detailed, as every line requires more cognitive load
  • Maybe even ditch multiband rasters altogether and/or make it part of temporal rasters at the end?
  • Overlay is great, but.... is this the right place to teach about how to write a function?

@jebyrnes

One up, One Downs from UMB Geospatial Data Carpentry Workshop

2019-01-22 and 23

I'm including the one up one downs as they say a lot about how our learners perceived the class. I'll also italicize a few to highlight them.

Day 1

One Up
++++Shortcuts to make our lives easier.
Everyone helping out was extremely patient with me
intro to R had many clear examples of structure of code. And shortcuts!!!!! Great pace++
+++friendly and helpful staff
the assistants were really dynamic and helpful - they were invaluable to understanding the lessons
++ Having challenges and making sure everyone was able to accomplish and understand
I appreciated the help from the assistants and patience as they helped me with my learning curve. +
Etherpad Superhelpful!+
+having all of the materials and examples organized on the website, allowing me to reference them and follow along ++
highlighting good coding practices++
Great use of practical examples to get an adequate understanding of how to use this on my own outside of the workshop
+Exercises were really helpful! +
organized layout/flow of intro to R and geospatial technical terminology +
Live assistant was the most helpful. Thanks for being patient w me!!
having downloaded everything prior streamlined the course
+Good pace; and I even learned a few new things
I like the carpentry green and red stickers
good reintroduction to data and file organization, which is especially helpful with spatial data!
good step-wise rundown of pipeline function of dplyr

One Down
Pre-workshop practice of basics would make it smoother
Downtime between exercises seems like it was waiting on challenges, when that's in my mind the "reach" goal and time doesn't need to be spent on it
More challenges would be helpful to practice manipulating the commands and troubleshooting errors
Make it more clear that the first day of "geospatial workshop in R" will barely cover any geospatial data in R
A bit confusing on the intro to Rasters
the lecture didn't always follow the online materials so I sometimes got lost going back and forth
+Too fast too much info. Brain fried! But the frequent break was helpful.
Maybe just use tidyverse from the start
A lot of info in one day
+It was hard to see the screen sometimes, which resulted in the symbols coming across incorrectly- please make it bigger!
++Would be nice to have etherpad-esque feature with the code
lectures felt slow
The last segment of class was difficult to follow and I don't feel like I was able to retain much information. +JEKB - we did the first raster lesson at the end of the day, but their brains were fried by the intro to R by the time we got there. We repeated it more slowly the next morning
a little rushed at end. would also like a good source to learn how to keep up with packages. Last segment was hard to follow, but it was also end of the day so that makes it hard.
i am not really sure what the example raster data means, it is great to be able to plot and code the data, but without the context it is a little abstract
This was a bit too fast paced for me but that could be because I am a beginner and we are only doing a 2 day workshop
I'm not sure if I downloaded everything properly (I guess I'll find that out tomorrow!); so I'm still a bit confused as to what programs/packages we NEED and WHY
little bit more review of R and dplyr than I was expecting (but never bad to get a reminder)

Day 2

One up
Vectors
Challenges +
I am no longer as intimidated by geospatial analysis in R as I was before; I feel like I can do it now
Positive environment for learning! + ++++
The vector examples and source code were super helpful, and I will definitely use them in the future+
the online resources and cheat sheets were a great resource++
Staff help was very helpful Thank you!! +++++++++
Laid back atmosphere (instruction without a "classroom" feel that allowed tangents to assist on open questions)
good intro to new packages! ++
vector portion was great!
arithmetic function building for raster datasets will be super helpful for me!
I actually can do R!
patience by all was amazing, teachers and assistant and students!! thanks to all!
the additional resources and cheatsheets are super good as well.
I enjoyed broader context, noticed that Jarrett often times gave us the "transitional statements" that filled the gap between code
nice pacing of breaks! helped me not feel burnt out

One down
Rasters
We should have 3 days instead of 2 days so we don't have to hurry so much++++
Too much information too quick - need more on what parts of code are actually doing behind the scenes-
would like to see work flow for field collected GPS data into R+++
I feel that separating the geospatial theory, the coding and learning of R, and then the manipulation of data in a different way would be helpful: coding and R first --> then geospatial concepts with vector data handling --> then raster data handling. This is how the knowledge was presented in Intro. to GIS and I found it to be more cohesive, even having zero background in that case.
Although it may not be the carpentry way, learning R and the geospatial work must be separated out into different courses or days
more manipulation of raster data+
pacing of the morning section was too fast to be able to do in action - easiest just to follow the notes
did not like repeating the material from the end of the day yesterday - wanted to move on to cover more/new material+
rename the raster files to something simpler and less cumbersome (and easier to follow along)
++++would have loved to have reached time-series stuff +++
the instructors had different styles, which made it hard to follow sometimes (I had a hard time following raster). Not everyone followed livecoding style+
Either extend the workshop to 3 days or shorten the R review from day 1 (so we have more time for geospatial!)

Comments from sticky notes with number of sticky notes that echoed the same theme:

I thought these were also insightful - we summarized our sticky note feedback with # of notes per general theme -

Q: What was your favorite part of the course?
13: liked overlaying vector and raster data
7: liked R/sf (general appreciation)
2: versatility of ggplot2
1: quick plotting in R (vs ArcGIS)
1: liked the reference materials for further self-learning

Q: What one thing would you like to change in this course?
6: get into spatial materials earlier
5: 3 day course / adjust balance of base R material vs. spatial material
4: logistics of time to copy code, files were a bit difficult to read in because of file names, number of files, files nested in folders (clean up provided data)
2: more general R material
1: more break points
1: more geospatial calculations and stats
1: example of spreadsheet of field collected data brought into R
1: combine R tutorials into the spatial parts to allow more time to focus one spatial components.

@jebyrnes

Notes on Geospatial Lesson from Workshop at UMB

2019-01-22 and 23

From @DrK-Lo

Intro to Raster Data in R: 1 hour

Plot Raster Data in R: 1 hour (start 11am)

Mistake? "The bin intervals are shown using ( to mean inclusive and ] to mean exclusive. For example: (305, 342] means “from 305 through 341”."
Challenge: Plot Using Custom Breaks - students need hint on new options: ggtitle, xlab, and ylab
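
On the possible mistake in the bin-interval wording: in base R, `cut()` by default produces right-closed intervals, so `(` is exclusive and `]` is inclusive, the opposite of the quoted sentence. A quick check, using made-up break points that mirror the quoted example:

```r
# Values sitting on and just inside the boundaries of the quoted interval.
values <- c(305, 306, 342)

# Break points are assumptions chosen to reproduce the (305, 342] bin.
bins <- cut(values, breaks = c(305, 342, 400))

levels(bins)            # "(305,342]" "(342,400]"
is.na(bins[1])          # TRUE: 305 is excluded by the open "(" end
as.character(bins[3])   # "(305,342]": 342 is included by the closed "]" end
```

So `(305, 342]` covers values greater than 305 up to and including 342. Passing `include.lowest = TRUE` would additionally place 305 in the first bin.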

Reproject Raster Data in R and Raster Calculations in R: 1 hr 30 min (start 1pm)

The instructor skimmed over some things and focused on the core
lines of code. We did the last activity for 30 min over a coffee break

Open and Plot Shapefiles in R start 2:30pm end 2:50 pm, spent 20-30 min on activity

Explore and Plot by Shapefile Attributes start 3:15 pm

Challenge: Subset Spatial Line Objects Part 1 and Part 2: did challenge

  • 3:30 Start color customization for line type (abbreviated?)
  • 3:43 start Challenge: Plot Polygon by Attribute US data
  • 4:05 plot multiple vectors on same plot
  • 4:12 load chm_HARV and show an example of raster and vector together
  • 4:17 talked about everything

Alternate timeline:

Morning 1: Intro to R

After lunch 1: set up project, dplyr, ggplot (1.5 hr)
  • Intro to spatial stuff (1 hr)
  • start 1st raster lesson (1 hr)

Morning 2: plot raster

  • project raster
  • raster calculation

Afternoon 2: open shapefile

plot shapefile and customize
project shapefile
(from the US map exercise, could move straight to Handling Spatial Projection & CRS in R
and add an example of plotting raster and shapefile together)
using online databases to get shapefile for your field site map

Other thoughts:

A lot of the challenges could be faster if we provide the code for the students
to load the files, and provide the functions that they haven't learned before

drakeasberry added a commit to marwahaha/r-raster-vector-geospatial that referenced this issue Nov 22, 2020
This pull request has gone stale. Referencing pull request changes, @JTSA comments and notes by @DrK-Lo found at datacarpentry/geospatial-workshop#31. Teaching time in episode 3 Reproject Raster Data will not be changed at this time.
zkamvar pushed a commit to fishtree-attempt/r-raster-vector-geospatial that referenced this issue Oct 17, 2022
This pull request has gone stale. Referencing pull request changes, @JTSA comments and notes by @DrK-Lo found at datacarpentry/geospatial-workshop/issues/31. Teaching time in episode 3 Reproject Raster Data will not be changed at this time.
zkamvar pushed a commit to datacarpentry/r-raster-vector-geospatial that referenced this issue Feb 7, 2023
This pull request has gone stale. Referencing pull request changes, @JTSA comments and notes by @DrK-Lo found at datacarpentry/geospatial-workshop#31. Teaching time in episode 3 Reproject Raster Data will not be changed at this time.