I was recently asked for some of the code that I used in a paper. First of all, I should firmly state that people should share code. Sharing ideas openly is, after all, the hallmark of academic research.
I am not often asked for code, and I had a few reactions to a recent request, which was made by a student. My first reaction was that of a professor: should I give the student a fish or teach the student how to fish? Usually, when I am asked for code, the request is extremely short with no context given, so it is hard for me to gauge how hard the student has tried to get the program to work. Was there a typo in the paper that is causing a problem? Where are they stuck? What types of programming errors are they getting? Are they simply being lazy? I just don’t know.
This particular request concerned code that was about 20 lines long. Most of my projects are longer, usually comprising hundreds or thousands of lines of code (I don’t really count, but they can get hairy). I found myself wondering if the requester–a graduate student–had really thought about how to run the code or was just emailing me. I exchanged a few emails with the requester before sending my code to make sure that I wasn’t doing someone’s homework for them.
Another recent request asked for code from a paper for which I did the computational work when I was in graduate school. I looked at the pseudo-code in the paper draft and wrote the code. It took me a day or so to get my code working, but it wasn’t particularly painful. In this case, I was confident that the paper was clear and unambiguous about how to write the code.
My second concern about sharing code is that–and I’m being honest here–my code is one giant hack. I am not a software programmer. I know what good, elegant code looks like, and it’s not mine. I often have to cobble together multiple programs to solve a problem from beginning to end. I often write a script to run many copies of the same program with different inputs. I always write code for analyzing the solutions and creating figures. Over the years, I have gotten good at making my code readable to me, so that I can come back to it after months or years and figure out what I did. But that’s not the same thing as being readable to someone else. This is a long way of saying that I’m a little embarrassed about sharing my code with others. Maybe I’m just prudent and am being too hard on myself. But I am married to a software programmer, so I am very aware of how high the bar really is for “good” code.
Having someone look at my code is like inviting someone into my house without straightening up first. It’s one thing to show my messy code to a collaborator, but it’s another thing to show my messy code to a stranger. Sharing papers and tech reports is different–they are polished, so they are OK to share. This can be somewhat addressed by commenting code better. I always start off commenting code well, but during the fog of debugging, my code usually gets a little out of control, and it’s hard to rein in after a while. (I’ve seen other people’s code. I have some good programming habits–my code could be much, much worse.)
However, as I am learning in my discrete optimization course this semester, even simple programming assignments such as implementing the Secretary Problem Markov decision process model can be incredibly difficult for PhD students. They can benefit from looking at my code. My homework solution code isn’t as wild and unruly as my research code. I’m getting used to sharing my code for the homework solutions.
On a related note, this post by Panos Ipeirotis reflects on how to make code more robust to changes, since old code often does not run if it relies on old libraries. Dr. Ipeirotis is a computer scientist, and it sounds like he writes more elegant code than I do. I’m still at square one, meaning that I am still trying to make my code readable to someone else.
How self-conscious are you about sharing your code?