Highlighted Selections from:

Software Carpentry: lessons learned


DOI: 10.12688/f1000research.3-62.v1

Wilson G (2014) Software Carpentry: lessons learned [v1; ref status: indexed, http://f1000r.es/2x7] F1000Research 2014, 3:62.

p.2: the time it takes the “desktop majority” of scientists to produce a new computational result is increasingly dominated by how long it takes to write, test, debug, install, and maintain software. -- Highlighted mar 12, 2014

p.2: While their undergraduate programs may include a generic introduction to programming or a statistics or numerical methods course (in which they are often expected to pick up programming on their own), they are almost never told that version control exists, and rarely if ever shown how to design a maintainable program in a systematic way, or how to turn the last twenty commands they typed into a re-usable script. As a result, they routinely spend hours doing things that could be done in minutes, or don’t do things at all because they don’t know where to start. -- Highlighted mar 12, 2014
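
Note (not from the paper): a minimal sketch of what "turning the last twenty commands into a re-usable script" might look like in Python. The function names (column_mean, summarize) and the sample readings are hypothetical, chosen only for illustration.

```python
import csv
import io
import statistics


def column_mean(csv_text, column):
    """Return the mean of one named column in a block of CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in reader]
    return statistics.mean(values)


def summarize(datasets, column):
    """Apply column_mean to every dataset instead of retyping the steps."""
    return {name: column_mean(text, column) for name, text in datasets.items()}


if __name__ == "__main__":
    # Hypothetical readings; in practice these would be read from data files.
    datasets = {
        "site_a": "day,temp\n1,10.0\n2,12.0\n",
        "site_b": "day,temp\n1,20.0\n2,24.0\n",
    }
    for name, mean in summarize(datasets, "temp").items():
        print(name, mean)
```

Once the steps live in a function, running them on a new dataset is one call rather than a repeat of the whole interactive session.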

p.2: This is where Software Carpentry comes in. We ran 91 workshops for over 3500 scientists in 2013. In them, more than 100 volunteer instructors helped attendees learn about program design, task automation, version control, testing, and other unglamorous but time-tested skills -- Highlighted mar 12, 2014

p.2: The program also enhances their habits and routines, and leads them to adopt tools and techniques that are considered standard practice in the software industry. As a result, participants express extremely high levels of satisfaction with their involvement in Software Carpentry (85% learned what they hoped to learn; 95% would recommend the workshop to others). -- Highlighted mar 12, 2014

p.2: Despite these generally positive results, many researchers still find it hard to apply what we teach to their own work, and several of our experiments, most notably our attempts to teach online, have been failures. -- Highlighted mar 12, 2014

p.2: Textbook software engineering is not the right thing to teach most scientists. In particular, careful documentation of requirements and lots of up-front design are not appropriate for people who (almost by definition) do not yet know what they are trying to do. Agile development methods, which rose to prominence during this period, are a less bad fit to researchers’ needs, but even they are not well suited to the “solo grad student” model of working so common in science. -- Highlighted mar 12, 2014

p.3: Piecemeal improvement may be normal in open source development, but Wikipedia aside, it is still rare in other fields. In particular, people often use one another’s slide decks as starting points for their own courses, but rarely offer their changes back to the original author in order to improve them. This is partly because educators’ preferred file formats (Word, PowerPoint, and PDF) can’t be handled gracefully by existing version control systems, but more importantly, there simply isn’t a “culture of contribution” in education for projects like Software Carpentry to build on. -- Highlighted mar 12, 2014

p.3: Most importantly, the MOOC format didn’t work: only 5–10% of those who started with us completed the course, and the majority were people who already knew most of the material. Both figures are in line with completion rates and learner demographics for other MOOCs, but are no less disappointing because of that. -- Highlighted mar 12, 2014

p.3: Software Carpentry restarted once again in January 2012 with a new grant from the Sloan Foundation, and backing from the Mozilla Foundation. This time, the model was two-day intensive workshops like those pioneered by The Hacker Within, a grassroots group of grad students helping grad students at the University of Wisconsin-Madison. -- Highlighted mar 12, 2014

p.4: So what does a typical workshop look like?

  • Day 1 a.m.: The Unix shell. We only show participants a dozen basic commands; the real aim is to introduce them to the idea of combining single-purpose tools (via pipes and filters) to achieve desired effects, and to getting the computer to repeat things (via command completion, history, and loops) so that people don’t have to.
  • Day 1 p.m.: Programming in Python (or sometimes R). The real goal is to show them when, why, and how to grow programs step-by-step as a set of comprehensible, reusable, and testable functions.
  • Day 2 a.m.: Version control. We begin by emphasizing how this is a better way to back up files than creating directories with names like “final”, “reallyfinal”, “reallyfinal_revised”, and so on, then show them that it’s also a better way to collaborate than FTP or Dropbox.
  • Day 2 p.m.: Using databases and SQL. The real goal is to show them what structured data actually is (in particular, why atomic values and keys are important) so that they will understand why it’s important to store information this way.

-- Highlighted mar 12, 2014
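
Note (not from the paper): the Day 2 p.m. point about atomic values and keys can be sketched with Python's built-in sqlite3 module. The table layout, column names, and sample rows below are assumptions made up for illustration, not the workshop's actual teaching dataset.

```python
import sqlite3


def build_database():
    """Create an in-memory database with one fact per column and keyed rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person (
            person_id TEXT PRIMARY KEY,  -- key: identifies exactly one person
            personal  TEXT NOT NULL,     -- atomic: given name only
            family    TEXT NOT NULL      -- atomic: family name only
        );
        CREATE TABLE reading (
            reading_id INTEGER PRIMARY KEY,
            person_id  TEXT NOT NULL REFERENCES person(person_id),
            quantity   TEXT NOT NULL,
            value      REAL NOT NULL
        );
    """)
    # Hypothetical people and measurements.
    conn.executemany(
        "INSERT INTO person VALUES (?, ?, ?)",
        [("p1", "Ana", "Lopez"), ("p2", "Ben", "Ito")],
    )
    conn.executemany(
        "INSERT INTO reading (person_id, quantity, value) VALUES (?, ?, ?)",
        [("p1", "temp", 10.0), ("p1", "temp", 14.0),
         ("p2", "temp", 20.0), ("p2", "salinity", 0.5)],
    )
    return conn


def mean_by_family(conn, quantity):
    """Average one quantity per family name by joining on the person key."""
    rows = conn.execute(
        """
        SELECT person.family, AVG(reading.value)
        FROM reading
        JOIN person ON reading.person_id = person.person_id
        WHERE reading.quantity = ?
        GROUP BY person.family
        ORDER BY person.family
        """,
        (quantity,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    print(mean_by_family(build_database(), "temp"))
```

Because the family name sits in its own column and each reading points at a person through a key, the grouping needs no string splitting and no guessing about which rows belong together.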

p.4: Some instructors still use the command-line Python interpreter, but a growing number have adopted the IPython Notebook, which has proven to be an excellent teaching and learning environment. -- Highlighted mar 12, 2014

p.5: We find workshops go a lot better if people come in groups (e.g., 4–5 people from one lab) or have other pre-existing ties (e.g., the same disciplinary background). They are less inhibited about asking questions, and can support each other (morally and technically) when the time comes to put what they’ve learned into practice after the workshop is over. -- Highlighted mar 12, 2014

p.5: Group sign-ups also yield much higher turnout from groups that are otherwise often under-represented, such as women and minority students, since they know in advance that they will be in a supportive environment. -- Highlighted mar 12, 2014

p.6: Pairing is a good practice in real life, and an even better way to teach: partners can not only help each other out during the practical, but can also clarify each other’s misconceptions when the solution is presented, and discuss common research interests during breaks. To facilitate this, we strongly prefer flat (dinner-style) seating to banked (theater-style) seating; this also makes it easier for helpers to reach learners who need assistance. -- Highlighted mar 12, 2014

p.6: Why do people volunteer as instructors? -- Highlighted mar 12, 2014

p.6: To build a reputation. Showing up to run a workshop is a great way for people to introduce themselves to colleagues, and to make contact with potential collaborators. This is probably the most important reason from Software Carpentry’s point of view, since it’s what makes our model sustainable. -- Highlighted mar 12, 2014

p.6: To help diversify the pipeline. Computing is 12–15% female, and that figure has been dropping since its high point in the 1980s. Some of our instructors are involved in part because they want to help break that cycle by participating in activities like our workshops for women in science and engineering. -- Highlighted mar 12, 2014