When not to program

It’s end-of-the-semester time ’round here, and students I’m TAing are frantically trying to get their final projects working. One of the emails I fielded today was basically asking, “Is there a way to get MATLAB to automate this for me?”

The answer to the question, of course, is “yes”, but knowing that student’s comfort with programming, I phrased it more like, “Yes, you could do it by XYZ, but it’s probably just as easy to do it manually for the 6 data points you’re interested in”. What I didn’t write (but really meant) was essentially, “If you have to ask, the answer is no”.

It’s worth pointing out that the course isn’t designed for teaching general programming concepts (unlike other courses I’ve been involved with), and instead uses a very restricted set of MATLAB and programming concepts as a tool to understand biological modeling. It’s so restricted a set of concepts, in fact, that back in week 3 or 4, I wrote up a handout that was essentially a step-by-step recipe for doing the coding for all but one of the rest of the labs.

In grad school, there are often two modes with somewhat opposing goals. First and foremost, you want to get stuff done. But you also want to leave things in a state where it will be possible to quickly and easily repeat things in the future. Sometimes that latter goal is achieved by stopping beforehand to think about how to structure the code; sometimes you even need to learn new skills that you will apply in the future (either programming or bench techniques or *gasp* math).

So in this student’s case, the answer (given that the final project is due in just a few days) is probably to do it the stupid, manual way that’s less elegant, but also much much faster than taking the time to really grok loops. In the long run of grad school, it won’t always be obvious where to spend more time making things faster, and where to just grind it out. And, of course, one final thought is that sometimes you can take longer to make it faster, and not succeed:

Teaching differential equations to biologists

Teaching, and especially teaching the same thing over again, is always an effort to iteratively refine how best to convey the information and understanding in your head to your students.

Analytical solutions

The first two equations we talked about in class were for exponential growth and logistic growth, both of which (happen to) have analytic solutions. It’s nice that they do, but I think it has perhaps gotten people into the frame of mind where they’re looking for those analytical solutions, or accidentally plugging them in when they should be using the differential equations.

If I were handed the reins for the next time this course will be offered, I’d try to avoid even writing the analytic solution to any of the differential equations.

Does this mean I think students should never be taught how to find analytical solutions? Absolutely not; I think there’s a lot of beautiful math that goes into deriving them. But for biology undergrads, many of whom haven’t taken a math class since AP Calculus in high school, the analytical solutions are a red herring. I think it’s worthwhile to help them understand what the graph itself will be shaped like, but being able to find the exact functional relationship isn’t usually necessary.

Walk through an example of numerically solving an ODE

For the way the course is set up, I think it’s necessary to step through an example of doing numerical integration—any numerical integration, even the Euler Method, which has terrible theoretical properties.

Thinking back on the first lecture, we actually did do this, but I think the problem was that it was presented as “Here’s a simulation of bacterial doubling”, rather than “here’s what ODE solvers do”. Did some students pick up on the fact that it’s actually one and the same thing? Maybe, but given that it didn’t occur to me, the TA, until writing this just now, I doubt if very many did.
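To make the "one and the same thing" point concrete, here is a minimal sketch of the Euler Method in Python (not the MATLAB used in the course). The rate, carrying capacity, and starting population are made-up values for illustration, and the function names are my own. The same stepping loop handles both exponential and logistic growth without ever touching an analytic solution; the exact exponential answer appears only as a check.

```python
import math

def euler(f, y0, t_end, dt):
    """Forward Euler for dy/dt = f(y), starting from y(0) = y0."""
    ys = [y0]
    for _ in range(round(t_end / dt)):
        y = ys[-1]
        ys.append(y + dt * f(y))  # step along the current slope
    return ys

r = math.log(2)   # assumed rate: one doubling per unit time
K = 1000.0        # assumed carrying capacity
N0 = 100.0        # assumed starting population

def exponential(n):
    return r * n

def logistic(n):
    return r * n * (1 - n / K)

coarse = euler(exponential, N0, t_end=3.0, dt=0.5)
fine = euler(exponential, N0, t_end=3.0, dt=0.001)
exact = N0 * math.exp(r * 3.0)  # analytic answer, used only as a check

print(coarse[-1], fine[-1], exact)
print(euler(logistic, N0, t_end=30.0, dt=0.01)[-1])
```

With the coarse step the simulated population falls noticeably short of three doublings, and shrinking the step closes the gap, which is a reasonable way to show students both what a solver does and why Euler's theoretical properties are poor.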

Come up with a useful definition of steady state

… and stick to it. One of the things I noticed is that the lectures never really covered what we mean by steady state, and actually seem to use a different version of steady state depending on the context. I think it’s worthwhile spending a good amount of time determining the steady states for a system, as well as discussing stability (though none of my students seemed to have a problem with the idea of an unstable steady state).

I think it’s worth distinguishing between a true steady state and a few different kinds of pseudo-steady state, where the system is still moving but is no longer dynamically interesting. In one case, autocatalysis in prion disease, the system undergoes a rapid shift from a degradable form to a non-degradable form, and so ultimately the system ends up producing an unbounded amount of the non-degradable form; while this isn’t a true steady state, I think that’s one kind of pseudo-steady state worth looking at. Perhaps more interestingly, in a separate example we looked at the pharmacokinetics of repeated doses of a medication. Because the patient takes discrete doses, the equations themselves don’t have a real steady state; nevertheless, the levels of the drug stabilize, which I think offers lots of great hooks to calculus, and to the idea that as we give the doses more frequently, the plot becomes more of a smooth curve.
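The repeated-dose case can be sketched in a few lines of Python. This is not the model from the course: it assumes the simplest possible setup, an instantaneous dose followed by first-order elimination (dC/dt = -k·C), and the dose size, elimination rate, and dosing interval are hypothetical numbers. Under those assumptions the peak levels form a geometric series, so they approach dose / (1 - exp(-k·tau)) even though the level never stops moving.

```python
import math

def peak_levels(dose, k, tau, n_doses):
    """Drug level just after each dose, with first-order decay between doses."""
    level = 0.0
    peaks = []
    for _ in range(n_doses):
        level += dose                 # assumed instantaneous dose
        peaks.append(level)
        level *= math.exp(-k * tau)   # exact decay of dC/dt = -k*C over one interval
    return peaks

dose, k, tau = 100.0, 0.1, 8.0        # hypothetical values
peaks = peak_levels(dose, k, tau, 30)
limit = dose / (1.0 - math.exp(-k * tau))  # geometric-series limit of the peaks

print(peaks[:3], peaks[-1], limit)
```

In this simple model the gap between peak and trough at the plateau is exactly one dose, so splitting the same total into smaller, more frequent doses shrinks the sawtooth toward a smooth curve.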

Flux analysis

I think one of the best ways to analyze systems, at least in a power of the technique vs simplicity sense, is to look at the total flux. At least in simple systems, there are certain quantities that are conserved or conserved in certain parts of the network, and it’s often useful to look at where that quantity ends up. For instance, I showed a few students who came to my office hours today that I could predict the quantity of a certain element in a system using that kind of analysis, and got pretty good accuracy.

Some of this grousing is really just Monday-morning quarterbacking, but it is really nice to have the opportunity to think about what I’d want to teach in the future, and the best ways to do it.

Should you share your data?

Seems the big scientific brouhaha at the moment is PLoS’s recent (clarifications to their) policy that data for a paper will be shared. In my field, the answer to the title of this post is, “OF COURSE!”

However, I get that there are different cultures, that vary by field, about what kind of data sharing is expected, and how much credit should be given to those who share data (citation, certainly, but what about authorship?). As has been discussed, there are also “lots” of corner-cases about where and how exactly the policy does or should apply. My guess is that PLoS actually left this intentionally vague, so that Editors can use their judgement (although hopefully they have been trained on what exactly the policy is meant to do; I can’t find the tweet that suggested this).

Continue reading Should you share your data?

Why you shouldn’t ask “why”

One major pastime for grad students is complaining about how bad our undergrads are (they, I’m sure, complain about us—I know I did when I was their age). They aren’t perfect, but sometimes their missteps can actually be enlightening.

In the most recent lab I graded, they generated this plot:
[Figure: 2 strains of bacterial growth]
And then we asked them to consider why the two lines were parallel, even though the growth rates of the two states of bacteria (which can interconvert) are so different. One of the responses was: “because the population sizes are increasing at the same rate”. To which I whacked my head on my desk, and wrote in “yes, but why are they increasing at the same rate?”.

I’ve been thinking about it more over the last couple days, and I realized the problem is not with the student, but with the question. “Why” tends to be too broad, with many possible answers. Unless the person you’re asking is interested in the exact same things you are, and even possibly in the same mood, they might then answer the “why” question with a totally different (though equally correct) answer.

Continue reading Why you shouldn’t ask “why”

Teaching Programming

This past week I started teaching lab sections for an “Intro to Systems Biology” class. The class marks a return to MATLAB for me, which is the first really dynamic language I learned, but a language I haven’t particularly been missing. We’ll see how it goes, but I’ve already identified some rough spots from the lectures that I think will end up being confusing:
Continue reading Teaching Programming

On the “quilt plot”

One of the big blow-ups on Twitter last week (at least in the circles that I follow) was the “quilt plot”, an article recently published in PLoS ONE. The quilt plot is actually just a heat map. Not a heat map with other bits removed, but a heat map. The article itself says that “they produce a similar graphical display to ‘heat maps’ when the ‘clustering’ and ‘dendrogram’ options are turned off”, but that misunderstands what a heat map is. While the Wikipedia page for heat maps has a hierarchically clustered heat map as the example image at the top, the examples farther down do not. To claim that there is something new here is to fundamentally misunderstand what already existed.

It’s not really new, but scientists can’t be familiar with the entirety of the literature, and often “rediscover” old techniques from other fields. What one would hope, however, is that the journal itself is able to spot this, and respond appropriately. PLoS ONE actually has guidelines for whether new software methods should be published. Briefly, they say they must have

  1. Utility
  2. Validation
  3. Accessibility

Continue reading On the “quilt plot”

Keeping Busy

Sometimes, despite all my best planning, I run out of tasks that need to be done. This seems to happen to me most often in the experimental phases of a project, where I first need to do experiment A, and see what A looks like, before deciding whether experiment B or C is the most appropriate direct follow up.

There are, nevertheless, things I can do during this time while I’m waiting for A to finish (other than wasting time on Facebook or the like).

Continue reading Keeping Busy

Grad Students Aren’t Stupid…

…but maybe there should be fewer of them anyways.

I think perhaps I’m late to this game, but there’s been quite a bit of discussion about “Thinning the Ph.D. Herd”, a Slate piece by Rebecca Schuman about Johns Hopkins University’s plan to reduce enrollment in PhD Programs.  Sean Carroll, for instance, says “I’ll believe we should accept fewer grad students when I hear it from actual undergrads applying for grad school.”  There is quite a strong strain of people arguing that people should be allowed to freely choose what to do, and that reducing PhD enrollment is paternalistic on the part of JHU. People seem to think that by downsizing programs, JHU is saying “you only think you want a PhD, but you are mistakenly signing yourself up for a life of penury, so we’re saving you from that”.

Responses to the Slate article

It’s worth recalling Braess’s Paradox, however.  In essence, it says that people might be making locally optimal choices that lead to globally sub-optimal outcomes for everyone (including themselves), and limiting choice may actually improve results for everyone. Nobody in the hypothetical scenario is acting irrationally or without having considered all the options, but things are worse than if someone stepped in to constrain choices.
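For anyone who hasn't seen the paradox before, here is a sketch of the usual textbook road-network version in Python. The numbers are the standard illustrative ones, not anything to do with grad school: 4000 drivers, two congestible roads that take (drivers / 100) minutes, and two fixed roads that take 45 minutes.

```python
def route_times(n_top, n_bottom, n_shortcut):
    """Minutes per route, given how many drivers choose each one.

    top:      Start -> A -> End       (congestible, then fixed 45)
    bottom:   Start -> B -> End       (fixed 45, then congestible)
    shortcut: Start -> A -> B -> End  (both congestible roads plus a free A -> B link)
    """
    start_a = (n_top + n_shortcut) / 100    # congestible road
    b_end = (n_bottom + n_shortcut) / 100   # congestible road
    return {
        "top": start_a + 45,
        "bottom": 45 + b_end,
        "shortcut": start_a + 0 + b_end,
    }

# No shortcut: drivers split evenly (ignore the "shortcut" entry, the road doesn't exist).
without = route_times(2000, 2000, 0)

# Shortcut open: everyone ends up on it, and nobody gains by switching back.
with_sc = route_times(0, 0, 4000)

print(without["top"], without["bottom"])  # 65.0 65.0
print(with_sc)  # shortcut takes 80.0, the other routes would take 85.0
```

Every driver is choosing sensibly in both cases, yet closing the free shortcut cuts everyone's trip from 80 minutes to 65.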

There are two questions we need to ask: is the current PhD system the globally optimal system for the students in it, and if not, would unilaterally reducing the number of grad students improve the results? In the sciences, I’m actually not sure that we’re too far off: most people I know of (in my Tier 1 R1 highly ranked program) live comfortably on our stipends, receive health insurance and various perks, and typically don’t have trouble finding jobs that use their research skills.  I suspect the same is true for the students in Profs Carroll’s, Krugylak’s, and Booty’s departments as well.

The same is not obviously true for the humanities, even from top tier institutions.  My partner is in a highly ranked, unusually well funded humanities program, and the horror stories I hear are chilling.  Her stipend is, in a good year, 75% as much as mine (although $2-5k more than many other programs in the humanities), her program has a shockingly high attrition rate, jobs outside academia are considered outright failure, and a significant number of people end up as adjuncts, earning wages below the poverty line.  Since Rebecca Schuman got her PhD in German, I suspect this is much the world that she is from as well.  Is this the globally optimal situation for graduate students? I think not.  Would reducing PhD enrollment help, especially if stipends are redirected to the ones that remain? I’m not certain, but I think it’s worth a shot.

Finally, I just want to make clear that reducing PhD enrollment isn’t the only way to improve the lives of grad students, nor even necessarily the best one.  But it is an option in the arsenal, and seems likely to have at least some positive effect.

New Year’s Resolutions

  1. Blog weekly.  It seems to be true that part of being a successful academic is in getting my name out there. Plus, the more you write, the better your writing tends to be, through a strange and little-understood process that some call “practice”. To that end, I’ve created this blog-shaped thing that I will attempt to write in with some regularity. See here for entry 1.
  2. Publish. I’ve got one small story that a roton worked on which the Boss thinks could be a small paper, and lord knows we probably aren’t going to do much else with it.  I’ve also got big plans for the next couple big experiments in my thesis project, which if I’m reasonably productive this spring while TAing, I ought to be able to get out the door by the end of summer, and hopefully in print by Christmas, assuming I come across nothing unexpected (ha!).
  3. Line up the next thing.  I’m expecting to graduate May 2015, and am currently researching potential post-doc labs. The timing is both flexible and in flux, depending in large part on 2-body problem considerations, but also on funding arrangements that I’m also in the process of researching.
  4. Relax. I’m often pretty bad about overscheduling myself, so the other resolutions notwithstanding, I’m hoping to keep as much downtime open as possible.

Vaguely blog-shaped musings