Establishing a Computational Biology Flipped Classroom

Jul 8, 2018

A hypothetical diagram of known bullet holes in returning airplanes illustrated as red points. Image Courtesy McGeddon, Wikimedia Commons.

Adapted from a keynote talk delivered at ISMB 2018 in the Education Community of Special Interest Group meeting. Later adapted for and published in PLOS Computational Biology.

Abraham Wald and Survivorship Bias in the Lecture

Abraham Wald was a Jewish émigré to the U.S. from Transylvania who was hired by the Statistical Research Group, a Columbia University team tasked with working on military problems during World War II. One problem Wald worked on was to examine the bullet distribution of airplanes returning from the battlefield and determine where to strategically place armor, since armoring the entire plane would make it too bulky and consume too much fuel. A hypothetical diagram showing a distribution of Wald’s data is shown in the figure above.

The knee-jerk reaction to Wald’s problem is to place armor over the locations of the red dots. However, Wald’s insight was that the armor should go not where the dots are, but where the dots aren’t. That is, the bullet holes on planes are likely uniformly distributed, but we don’t observe this uniformity in our data because we are missing the planes that didn’t return for analysis because they were shot down. (For a lengthier discussion of Wald, see Jordan Ellenberg’s Medium post.)
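To make Wald's argument concrete, here is a minimal simulation sketch. The section names and lethality values are invented for illustration and are not Wald's data; the only assumptions carried over from the text are that hits land uniformly and that we only observe planes that return.

```python
import random
from collections import Counter

# Hypothetical plane sections and the chance that one hit there downs the plane.
# These numbers are made up for illustration; they are not Wald's data.
LETHALITY = {"engine": 0.8, "cockpit": 0.6, "fuselage": 0.1, "wings": 0.1}


def simulate(num_planes=10_000, hits_per_plane=3, seed=0):
    """Return (all_hits, observed_hits): hit counts per section for every
    plane, and for only those planes that made it back."""
    rng = random.Random(seed)
    sections = list(LETHALITY)
    all_hits, observed_hits = Counter(), Counter()
    for _ in range(num_planes):
        # Hits land uniformly at random across the sections.
        hits = [rng.choice(sections) for _ in range(hits_per_plane)]
        all_hits.update(hits)
        # A plane returns only if it survives every one of its hits.
        if all(rng.random() > LETHALITY[s] for s in hits):
            observed_hits.update(hits)
    return all_hits, observed_hits


all_hits, observed_hits = simulate()
for section in LETHALITY:
    share_all = all_hits[section] / sum(all_hits.values())
    share_seen = observed_hits[section] / sum(observed_hits.values())
    print(f"{section:9s} all planes: {share_all:.2f}   returning planes: {share_seen:.2f}")
```

Under these assumed numbers, every section receives about a quarter of all hits, yet engine hits are rare among the returning planes. The sparse regions in the observed data are exactly the ones where a hit is most dangerous.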

Wald’s analysis offers an excellent example of survivorship bias, in which we are biased toward a particular conclusion because of the invisibility of some subset of our data. The oft-cited ancient example of survivorship bias is from the cynic Diogenes. When shown paintings of shipwreck survivors and asked how he could fail to see divine Providence in their survival, he replied, “Why, I say that their pictures are not here who were cast away, who are by much the greater number”.

What does all this have to do with education? As professors, we are the survivors of the lecture. We are the “A” students who — at least in our field of expertise — could sit through an hour-long lecture and actually internalize the information that our own instructors were throwing at us. We are so enamored with the lecture that we organize our conferences around it, lapping up talks for days on end. But have we stopped to ask ourselves whether all this lecturing is what is best for our students?

The idea that the lecture is flawed is neither controversial nor new. In 1984, famed educational psychologist Benjamin Bloom published a landmark paper after examining student performance under different classroom mechanisms. What he found was that one-on-one tutoring out-performed the conventional 30:1 lecture by approximately two standard deviations. That is, a “C” student in a lecture-based class would on average be an “A” student if they had learned the same material via individualized tutoring. I prefer to view this result inversely: there are perfectly capable “A” students that we are transforming into “C” students because by lecturing, we are not teaching them correctly.
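For a sense of scale, a two-standard-deviation shift can be converted into a percentile. The sketch below assumes that scores in the lecture class are roughly normally distributed, which is a simplification; the function name is just illustrative.

```python
from statistics import NormalDist


def percentile_after_shift(effect_size_sd):
    """Percentile, within the comparison class, of a student who sits
    effect_size_sd standard deviations above that class's mean
    (assuming normally distributed scores)."""
    return 100 * NormalDist().cdf(effect_size_sd)


# The average tutored student, two standard deviations up, measured
# against the distribution of the lecture class.
print(round(percentile_after_shift(2.0), 1))  # 97.7
```

In other words, under the normality assumption the average tutored student lands near the 98th percentile of the lecture class.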

Bloom’s paper should have rung the death knell of the lecture, at least in the STEM classroom. Yet here we are, decades later, still lecturing away, and charging more for it than ever. So what can we do instead?

What is a Flipped Classroom? or “How I Learned to Stop Worrying and Hate the Lecture”

Why might the traditional lecture be a bad educational vehicle? Let us consider another of Bloom’s namesake educational psychology paradigms, Bloom’s taxonomy, organized into a pyramid shape in the figure below. One could criticize this pyramid as reductive, but we clearly want our students to move upward in the pyramid, enabling them to apply and evaluate what they have learned within new contexts. Yet where in the typical lecture are we providing them with anything other than — at best — the bottom two levels of the pyramid?

A revised 2001 version of Bloom’s Taxonomy (1956), organized into a pyramid. Higher-level skills proceed upward.

The simplest definition of a flipped class is one in which students prepare these bottom two levels on their own, freeing up valuable in-class time for other activities. At a minimum, this time can be used to ensure understanding and provide students with timely feedback on their work. Ideally, class time is devoted to assessments that build higher-level student abilities.

I prefer the definition of a flipped class proposed by Bishop and Verleger in 2013. As summarized in the figure below, materials that students review outside of class must be automated in some way. This definition excludes class structures in which students simply read a PDF or watch a lecture video without any exercises as pre-class instruction. Their definition also helps indicate the clear bottleneck in implementing flipped classes. How can we provide explicit, automated instruction to students in an interdisciplinary STEM field like computational biology?

An overview of the flipped classroom, from Bishop and Verleger 2013.

Fortunately, we are in the midst of a boom in automated online materials. In my own case, I was fortunate enough to co-develop (with Pavel Pevzner) the first massive open online course (MOOC) in computational biology, which first launched in 2013. Yet although I have always been interested in education, we never imagined designing a MOOC. Our dream was to write what we called a “superbook”, which we fleshed out into our own acronym: a massive adaptive interactive text, or MAIT. This text product, called Bioinformatics Algorithms, became the engine for our courses, and in a Communications of the ACM article, we unpack how this acronym pertains to our project. A brief summary follows.

Our amazing development team for the Bioinformatics Algorithms textbook project.

Massive: It does not suffice to label an educational venture “massive” if it receives tens of thousands of enrolled learners. It must also benefit from large amounts of funding and much hard work on the part of its creators. We received gracious funding from three different bodies (HHMI, NIH, and the Russian Ministry of Education and Science) and organized a team of course developers who helped us bring our materials to life with many thousands of person-hours of effort over five years. See the accompanying figure for our amazing team.

A series of learning paths through our interactive textbook. The central spine corresponds to required modules. Modules off this spine are optional and provide learners with multiple paths through the same content. Each color-coded text box corresponds to one unit of content, corresponding to one page in the browser with its own embedded discussion forum.

Adaptive: For any education project to work, it must respond to the diverse needs of its learners. The accompanying figure shows an excerpt of our interactive text. Each square represents a unit of content, and squares are color-coded according to the types of exercises present at each unit. The central spine of the figure corresponds to just 20 minutes of a standard lecture; yet when we presented this material to students, they provided many insightful comments in the form of hundreds of discussion forum posts. We have therefore integrated both FAQs and remedial modules into the text and have worked to present these modules only to learners who struggle with certain assessments, allowing them to divert from the main text and return when ready. To facilitate this adaptivity, we were fortunate enough to collaborate with the education start-up Stepik, whose founders were willing to tailor their platform to our needs as content creators.
Interactive: Building an adaptive text meant that the exercises and programming assignments in our book needed to be “just in time”, presented at the exact moment they’re needed to help the learner build their own understanding and avoid misconceptions. We also incorporated “Stop and Think” questions into the text encouraging the learner to contemplate some aspect of the material that often bridges to the next topic. Furthermore, interactivity should mean peer-to-peer interactivity as well; every page of our interactive text has its own embedded discussion forum.
Text: Reliance on video is risky, since lecture videos take many hours to produce and become obsolete almost immediately. Text-based content provides authors the flexibility to design a truly adaptive product that can change quickly according to learner needs. So although we did record studio-quality lecture videos accompanying the material (and published them to our own YouTube channel), the interactive textbook has always been our main focus. And when we have asked our online learners what they prefer to help them learn, they have resoundingly voted in favor of our text.

A project like Bioinformatics Algorithms offers the ideal missing link in implementing the automation component of a flipped class. After we had written this text, Pavel and I vowed that we would never lecture again. But how should we flip our classroom?

Five Years of Organizing a Flipped Classroom for Bioinformatics

I have worked on implementing a flipped class for the past five years. In what follows, I will discuss the successes and pitfalls that I have encountered along the way. If you are considering flipping a class — whether in computational biology or not — I hope that this narrative can prove useful so that your course proves successful.

Spring 2014: Molecular Sequence Analysis (UC San Diego)

In spring 2014, while still a PhD student, I served as a TA for this course, with sections for both undergraduate computer science students and PhD bioinformatics students and around 70 students across the two sections. Pavel instructed the course, and I worked with him to set up weekly discussions with students.

We asked students, while completing a weekly interactive reading and a pre-class comprehension quiz, to draft a writeup of their progress through the text, noting:

everything that they found challenging;
any difficulties that they were unable to resolve on their own;
any additional questions (however tangential) that they might have about the material.

We then counted this writeup as part of a required participation grade in the course.

Although the in-class meetings occasionally featured guided group exercises, we largely treated them as open Q&A sessions, with the instructor addressing open student questions. After this session, students would be ready to complete a homework assignment building further understanding.

What did we learn from this first flipped class experience? It may seem that running a flipped class would facilitate instructor laziness — just 90 minutes of effort per week! Instead, we found that students will ask a plentitude of wonderful questions, many of which we did not anticipate even after having many thousands of online learners complete the same material. On average, it takes the instructor at least an hour of time for every 10 students enrolled in a class just to organize their questions before class. For this class, it was a full day’s work just to prepare for the in-class Q&A.

Nevertheless, we found that an instructor-led Q&A was an awful way to construct a class. Despite our best intentions to defeat the lecture, it had reared its head: we had returned to an instructor-centric classroom in which 35 students listened to a single instructor spout answers at the front of the room. This setup became particularly troubling in light of the following issue.

Issue 1: How do we motivate possibly unmotivated students to engage in the flipped course?
In course evaluations, many students were wildly positive, noting that they understood the material far more deeply than they could have ever anticipated. A few students despised the course setup. This “bimodality” of student enthusiasm is an outcome that we have heard from others who have tried to flip a course. Its apparent cause is that the lecture asks so little of its students that growing pains can arise when the flipped classroom starts asking for more. The solution is to build class time around peer-to-peer interaction, which makes students far more likely to engage. I have also learned to tie part of the participation grade to in-class participation.

Spring 2016: Fundamentals of Bioinformatics (Carnegie Mellon)

In spring 2016, I was a bright-eyed new teaching-track professor at Carnegie Mellon, and I had the opportunity to teach my own version of the course. Six students enrolled in this first offering of the course, which was designed as an elective for first-year MS students in our School of Computer Science.

In an effort to facilitate a student-centric classroom, I performed a critical shift in the in-class portion of the course. As the instructor, I posed anonymous student questions and required that students work together to answer them, while I facilitated the discussion as a “guide on the side” (see King, 1993).

Despite a small sample size, it was immediately clear from student engagement how much peer responses helped students. (Peer instruction is something I’ve worked to grow in all of my courses.) Although 90 minutes of peer Q&A did grow monotonous at times, the course felt like an overwhelming success, which was confirmed by superlative course evaluations from all my students. I only received one qualitative course evaluation in the free response category; an excerpt is below.

“Dr. Compeau… has fantastic teaching styles and techniques which is a not a common quality for university professors if not rare. I truly recommend [this course] to others as by far this is my favorite course at CMU. The interactive textbook is also a gem, designed in a very informative yet not boring way.”

This was just my second semester as a professor, and yet I had already ascended Olympus! My work with this course was finished; or so I thought.

Spring 2017: Fundamentals of Bioinformatics (Carnegie Mellon)

In spring 2017, my course became a required course for a subset of 30 MS students; this course would be their only real exposure to computational biology. Building on the success of my previous offering, I split the class into four equally sized groups. This meant quadrupling the amount of in-class time that I would spend on the course, but my experience from the previous year had been so rewarding that I did not mind.

To address the potential monotony of peer Q&A sessions, I restructured the in-class sessions with the following weekly structure:

A short mini-lecture (often led by the TA) on a particularly engaging and often tangential topic. The purpose of this mini-lecture was not to teach basic content, but rather to provide additional context on a fun concept that a lecture-based course would not have time for. A classic example was that in covering genome assembly, we had time to teach students exactly how sequencing by synthesis works, and how it can be generalized to form paired-end reads.
A more compact instructor-facilitated peer Q&A, with some group exercises peppered in.
An instructor-facilitated group discussion guiding students to discover the next week’s material. For example, when the next week’s material centered on evolutionary tree construction, I first gave students guided questions prompting them to think about how we would compare multiple species using genomic data. When they deduced that we would need a tree structure, I led them to devise computational problems modeling evolutionary tree construction. We then discussed what algorithms we could use to construct evolutionary trees. I have been amazed at how well students have responded to this setup; it is common for a student group to rediscover the classic algorithm UPGMA in just 20 to 30 minutes with limited instructor guidance.
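
For readers who have not met UPGMA, the idea that student groups converge on can be sketched in a few lines: repeatedly merge the two closest clusters and average their distances to everything else. This is a minimal sketch of the standard textbook formulation, not the in-class exercise itself; the function name, the input format, and the toy distances are all illustrative assumptions.

```python
from itertools import combinations


def upgma(labels, dist):
    """Build a rooted tree by repeatedly merging the two closest clusters.

    labels: list of leaf names (strings).
    dist: dict mapping frozenset({a, b}) to the distance between leaves a and b.
    Returns a nested tuple (left, right, height) with leaf names at the tips.
    """
    # Each cluster name maps to (subtree, number_of_leaves).
    clusters = {name: (name, 1) for name in labels}
    d = dict(dist)
    while len(clusters) > 1:
        # Pick the closest pair among the current clusters.
        a, b = min(combinations(clusters, 2), key=lambda p: d[frozenset(p)])
        (tree_a, size_a), (tree_b, size_b) = clusters.pop(a), clusters.pop(b)
        merged = f"({a},{b})"
        # The new internal node sits halfway between the merged clusters.
        height = d[frozenset((a, b))] / 2
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average of the two old distances.
        for c in clusters:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = ((tree_a, tree_b, height), size_a + size_b)
    tree, _ = next(iter(clusters.values()))
    return tree


# Toy distances between four hypothetical species.
toy = {
    frozenset(("A", "B")): 2, frozenset(("A", "C")): 6, frozenset(("B", "C")): 6,
    frozenset(("A", "D")): 10, frozenset(("B", "D")): 10, frozenset(("C", "D")): 10,
}
tree = upgma(["A", "B", "C", "D"], toy)
print(tree)
```

On the toy data, A and B are joined first, then C, then D, with each internal node placed at half the distance between the clusters it joins.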

Although the in-class sessions had even greater structure, scaling up the course presented a series of issues. Teaching students for whom this was a required course and likely their only contact point with computational biology meant lower student interest. This phenomenon might have been manageable except for the following lesson.

Issue 2: Selecting the correct cohort size is vital.
Groups with 7–8 students wound up being far too small. Despite restructuring the groups, I noticed that I tended to have two superstar groups and two groups that struggled to get off the ground. Because of the smaller size, if a group had just a couple of unenthusiastic students, their lack of interest tended to permeate the group. This is a known phenomenon, and I deeply regret not speaking to the educational experts at our university’s Eberly Center before I set up the groups; when I spoke to them as part of my course post mortem, they immediately informed me that my groups were too small and that the ideal cohort size is 20–25 students.

A second problem that I had never anticipated was that perfectly capable students might — gulp — actually prefer a traditional lecture to the flipped classroom. Why this was the case took me some time to realize and leads to the next issue.

Issue 3: Students may conceive of a professor as the “font of all wisdom”.
Most of my students in this course were, for the first time, international students. Many of these students came from cultures in which the lecture is even more ingrained as the only acceptable vehicle for learning than it is for domestic students. Because of the seemingly radical format of the course, it became apparent that some students even distrusted my expertise. More generally, some students may feel that a goal of a classroom is to be wowed by the instructor’s knowledge; if the student can only lap an occasional drop from the fountain, then it is the student’s problem, not the instructor’s. This lesson taught me that the most important part of a flipped course is the sales pitch with which it must begin. To reach all students, the instructor must be willing to spend 30–45 minutes at the start of the course articulating the weaknesses of the lecture, the pedagogical goals of the course, and why the flipped course is being established as a student-centric environment. The pitch must also clearly convince students that the skills tested on the course’s exams will be the “higher-level” ones that are developed in the flipped classroom. (Special thanks to Charlie Garrod and Mark Stehlik for their thoughts here.)

Spring 2018: Fundamentals of Bioinformatics (Carnegie Mellon)

In this year’s iteration of my course, I kept the entire cohort of students in one discussion section, which I lengthened to a single 2–2.5 hour session per week. At the beginning of each class, I gave a 20–30 minute review of learning objectives from the week. This measure helped ensure that students were fully prepared for later discussion and that the material was fresh in their minds.

Students then split into small groups of 4–5 students — perhaps the ideal group size for problem-solving — and completed three separate instructor- and TA-guided sessions.

Common discussion questions from the reading.
Challenge problems letting students apply what they’ve learned to new contexts.
A guided write-up with exercises setting up the next week’s material.

A couple of remarks on the structure. First, note that I returned to employing small groups, but only within the context of a larger classroom containing several smaller groups. In doing this, I allowed students to form their own groups, which I was careful to monitor. All groups were engaged and productive, in part because each of these three sessions was typically no longer than 30 minutes, and in part because groups seemed to want to perform well in front of their peers.

Second, an example of a set of discussion questions and challenge problems is given here. (For this particular week, the challenge problems are lengthier, so I did not include as many discussion questions.) The challenge problems allow students to apply dynamic programming, a concept that they have learned about only in the context of sequence alignment, to the entirely new problem of virus attenuation. What excited me about this series of challenge problems is that all of the groups essentially reached the correct conclusion, and at the end of the discussion, I could tell them that they had just re-built the engine of a paper with several hundred citations. What more could we want from our teaching lives than to equip our students to replicate scientific research in the classroom?

That having been said, I still kick the tires after every run of a course, and I don’t think that the course went perfectly. Why?

Issue 4: “Course creep” can bloat a course.
In advance of the course, I anticipated that students might become overworked if I expanded each week’s meeting by up to an hour. Accordingly, I trimmed a week’s worth of material from the course and shortened certain homework assignments by a bit. Even so, 2.5 hours on one topic can run the risk of exhausting already strung-out students; I would perhaps prefer a structure with two 75-minute sessions. Course bloat is a phenomenon that instructors of flipped classes often report. If we start asking our students to learn what we are teaching them at a deeper level, we may simply have to sacrifice some breadth.

So … Is All This Worth It?

This was a lot of work to devote to flipping a single course, but I was encouraged to see how my final exam scores in 2018 compared to those of last year (see figure below). The two exams were of comparable difficulty with similar rubrics, and yet there was a stark increase in student performance. I attribute this uptick to better-structured in-class sessions with the in-class challenge problems that I added to the class this year, as well as my “sales pitch” on why students should be motivated to trust in the flipped class and how their participation would correlate directly to their performance in exams. I would note that my final exam is designed to be quite challenging and test the student’s ability to apply what they’ve learned to new problems — I cannot imagine how a student in a lecture could do well with it.

Comparing final exam scores for Fundamentals of Bioinformatics in 2017 and 2018 reveals a significant upward trend in scores, especially for the bottom 75% of learners.

Yet although I feel that my course has been a success, research results on flipped courses tend to be mixed. Why might that be the case?

First, despite huge efforts into online education, existing materials for automating students’ pre-class learning can be spotty. Asking every instructor to construct their own materials is perhaps unrealistic.

Furthermore, the lecture is stable in that even the worst lectures proceed from point A to point B with few surprises. But if you make a mistake in setting up a flipped class structure, it becomes evident quite quickly; I made what I thought was a slight error in setting up group sizes in 2017 and found it difficult to recover.

Hard data can prove very helpful in educational research, but it’s not the only thing that we should trust as educators. Teaching is a visceral, exhausting, emotional exercise when done right — why not trust our feelings to help find the best ways to reach our students? And my feelings are that you would have to handcuff me to the lectern if I had to convert my bioinformatics fundamentals course back to a lecture structure.

That having been said, this post is not meant to be propaganda in favor of the flipped course; there are many ways to build student-centric environments. In active learning, we take a constructivist approach to student learning, finding opportunities for students to learn material by doing rather than listening whenever possible. This is the “just in time” premise of Bioinformatics Algorithms, as well as the guided discussion and challenge problem sessions that I give students in my flipped course. It’s also how I have set up Programming for Scientists; all coding is done as a code-along, and I rarely spend more than 10 minutes in the course without asking students to complete an exercise, either individually or in groups. Freeman et al. surveyed over 200 studies and found that students in courses with active learning outperform those in traditional lectures by around half of a standard deviation, and that students in traditional lectures were about 1.5 times more likely to fail. And Jensen et al. have argued that what makes a flipped classroom really outperform a lecture may be simply attributable to its incorporation of active learning.

What this post is meant to be is a call to end the lecture in STEM courses as we know it. Education might be one of the last fields of human endeavor to be disrupted by automation, but given how large the higher education market is, automation is coming. There has been a recent downturn in the popularity of online education — for a variety of reasons — but it will prove temporary in the wider span of the 21st Century. As instructors, we have a rare opportunity to leverage this automation and improve our students’ learning. If we don’t, we may very well find ourselves cast to the scrap heap.

Originally published at http://compeau.cbd.cmu.edu on July 8, 2018.