composition.al

My first year as a professor

On July 1, I celebrated my first anniversary of employment as an assistant professor at UC Santa Cruz! As my second year gets under way, now seems like a good time to recap how my first year went, quarter by quarter, and consider how I want things to go this year.[1]

Summer 2018

When I started at UCSC in July 2018, I had almost three months before the fall 2018 quarter began. Right away, I started working with fellow !!Con co-founder Maggie Zhou on organizing !!Con West, which was to be the UCSC version of !!Con. Unlike all the previous !!Con conferences we’d worked on (which had been organized by a group of friends who all already knew each other), this time we were creating a new organizing team from scratch by asking people to apply. About twenty-five people applied to be on the team, and after much deliberation, Maggie and I asked a subset of those people to join us. We held our first meeting (via videoconference, spanning multiple continents) on August 14, and a six-month-long mostly-remote conference organization process commenced. Meanwhile, I began to do some grass-roots organizing on campus over the summer. I printed up flyers and gave them to anyone I met. I also met with Max Kreminski to begin planning for what would eventually be the Computery Zine Fest that was co-located with !!Con (although Max and the rest of the CZF team did almost all the work of putting it together).

I also began preparing to teach my fall 2018 graduate seminar course on languages and abstractions for distributed programming. A seminar course isn’t like a lecture course, where I’m putting on The Lindsey Show for students three times a week; in a seminar course, the value proposition for students is less immediately obvious, since I mostly spend class time just sitting and listening to students talk about the papers they read. The main reason to take such a course, then, is that it has a thoughtfully curated collection of readings that form a coherent whole, presented in an order that highlights their relationships (unlike, say, the collection of papers you’d get by typing “distributed programming” into Google Scholar and picking the first twenty-five results that come up). So I spent a lot of time last summer working on choosing and reading papers for the course, as well as inviting a lot of guest speakers.

When I wasn’t doing the above things, taking care of various reviewing obligations, or taking data science MOOCs[2], I spent a lot of my first three months on campus meeting people and learning my way around myriad campus systems: figuring out how to buy stuff, sign up for health insurance, fill out reimbursement forms and so on. I’m awful at paperwork, especially paperwork that has to be done on physical paper, so all this took a lot of time.

In retrospect, I wish I had gotten more Real Research done in those precious three months before I had to start teaching. On the other hand, it’s hard to get Real Research done when you don’t have any students and barely even know any students. There was one thing I worked on during this period that I hoped would turn into a successful research collaboration with someone and their students at another institution, but it ended up not working out.

Fall 2018

Fall started, and I began teaching my distributed programming seminar course to six enthusiastic Ph.D. students! The most time-consuming aspect of teaching this course was the massive amount of work I poured into our course blog. Every student contributed two posts to the blog, and I gave multiple rounds of detailed feedback on all twelve posts, learning a lot from the students in the process. The reactions we got to the blog made me feel like all the time and effort was worth it. (For those interested, I wrote a course retrospective that goes into a lot more detail about the blog and the rest of the course.)

Aside from editing the blog, setting up the blog infrastructure ate up some time, too. Since it was going to be a group blog, I wanted it to be easy for everyone to update. For this, my own blog, I use a static site generator that I run locally before pushing the generated files, but I didn’t want to make students install any software (and especially not Jekyll, since having a working Ruby environment is the hardest problem in computer science). So instead I figured out how to make the most of GitHub Pages’ site generation automation (which uses Jekyll behind the scenes) so that students could just push Markdown files to a GitHub repository. I looked at various Jekyll themes before settling on one auspiciously called “Minimal Mistakes”, which you can use with GitHub Pages via the remote_theme feature. The time I invested in figuring all this out has been worthwhile, since I used the same setup for my next course and expect to keep using it for future courses, too.

Since I didn’t have any students yet and needed to build a team to get research done, recruiting was a top priority for me this year, and so my secret ulterior motive in pouring so much effort into the blog was to be able to have something nice to point prospective students to that would signal to them what kinds of problems I’m interested in working on and what kind of aesthetic guides my work. I sent a lot of boilerplate replies to inquiries from prospective students that went something like, “If you want to learn more about what kind of research I could advise you in, then here, go look at this blog that my students and I are writing!”

Every Monday, Wednesday, and Friday, after my class ended, I would hurry across campus to go sit in on my colleague Peter Alvaro’s undergrad distributed systems course so that I could prepare to teach it myself in the spring. It was incredibly fun, and I learned a ton. I also did one guest lecture in Peter’s class, and that went well, too.

Early in the fall, I asked Peter if he wanted to go in with me as a co-PI on an NSF grant proposal, and he agreed. Looking back at emails I sent last September, the core of the idea I had then was “use Liquid Haskell to check that the commutativity and associativity assumptions that a CRDT library makes are actually true”. I also began meeting with a student of Peter’s, Kamala Ramasubramanian, who was interested in this project, and as we all talked about it, the idea evolved and expanded quite a lot. (Although Kamala ended up focusing on her own projects instead of this, I’m now on her dissertation committee and am excited to see where her work goes!)
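To make "commutativity and associativity assumptions" a little more concrete, here is a minimal sketch in Python of what those assumptions look like for a CRDT merge function. This is not the approach from the proposal: Liquid Haskell would aim to prove the laws for all inputs, whereas this sketch only tests them exhaustively over a small, bounded set of states. The names `gcounter_merge`, `keep_left`, and `violated_laws` are hypothetical and chosen just for illustration.

```python
from itertools import product


def gcounter_merge(a, b):
    """Merge two grow-only counter states (one count per replica)
    by taking the per-replica maximum."""
    return tuple(max(x, y) for x, y in zip(a, b))


def keep_left(a, b):
    """A deliberately broken 'merge' that ignores its second argument."""
    return a


def violated_laws(merge, states):
    """Return the names of the laws that `merge` violates on `states`.

    This is a bounded, brute-force check over the given states,
    not a proof that the laws hold for every possible state.
    """
    violated = []
    if any(merge(a, b) != merge(b, a)
           for a, b in product(states, repeat=2)):
        violated.append("commutativity")
    if any(merge(merge(a, b), c) != merge(a, merge(b, c))
           for a, b, c in product(states, repeat=3)):
        violated.append("associativity")
    return violated


# Every state of a two-replica counter whose counts range over 0..3.
small_states = list(product(range(4), repeat=2))

print(violated_laws(gcounter_merge, small_states))
print(violated_laws(keep_left, small_states))
```

The per-replica maximum passes both checks on these states, while the left-biased function is associative but fails commutativity, which is the sort of mistaken assumption a verifier would be meant to catch.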

Finally, while all this was going on, !!Con West preparations went into high gear, thanks to our amazingly thoughtful and hard-working team. We got over 200 talk proposals, and we were (with difficulty) eventually able to converge on a set of thirty accepted talks. I was overjoyed to see !!Con flyers (which I shamelessly handed out in a faculty meeting!) begin showing up on my colleagues’ doors. Also thrilling was how, when I took an initial pass over our Ph.D. applicants for fall 2019, out of the four or five who specifically mentioned wanting to work with me (which was exciting in itself), one actually said they had become interested in working with me in part because of my work on !!Con!

Winter 2019

Winter came, and I got a break from teaching (one quarter off from teaching in the first year being a nice thing that UCSC does for new faculty) and finally focused my attention more on research. Peter and I finished writing and submitting our grant proposal, then turned it into a SNAPL 2019 paper submission. (The SNAPL reviewer who thought our paper sounded sort of like a grant proposal was correct. I would point out, however, that sounding sort of like a grant proposal is not unusual for SNAPL.) The idea for this paper had its roots in the time I spent with the Reluplex team in 2017, which opened my eyes to the power of custom-built SMT solvers to exploit domain knowledge for efficiency. Reluplex got great performance results by raising the level of abstraction at which the solver was capable of reasoning, and so I had begun to wonder if one could build a custom SMT solver with baked-in knowledge of the kind of order-theoretic notions necessary to efficiently verify consistency properties of distributed systems.

In addition, it had always sort of bugged me that I felt like I could more or less understand the Reluplex decision procedure on paper, but that understanding the actual implementation involved wading into a swamp of low-level C++ that I didn’t relish trying to read. The tool of my dreams, I imagined, would let me write a high-level declarative specification that would look a lot like, oh, say, figure 5 of the Reluplex paper, and then I could press a button and have it spit out a reasonably efficient implementation of Reluplex. So I began to wonder if it was possible to have something that’s similar in spirit to the Delite framework for rapidly building high-performance DSLs, but instead for rapidly building high-performance theory solvers. I don’t yet know how we’re going to do it, but I want to take a shot at it. (If you want to work on this, maybe you should come do a Ph.D. with me!)

The SNAPL paper was accepted (yay!); the grant proposal was not, although the reviews were quite encouraging. I was mostly just happy to have actually gone through the process of submitting a grant proposal and learning what that process was like. I had co-written one successful NSF grant proposal with my advisor in grad school, but I had (thankfully) been sheltered from the actual submission process and the many required supplementary documents. My ProTip™ for any other new faculty member who needs to figure out how to write NSF grants: for your first one, get a more senior person (in my case, Peter) who knows their way around to be your co-PI. (But take my advice with a grain of salt, since the proposal wasn’t funded.)

Ten weeks off from teaching in winter also gave me the chance to do some traveling and speaking. In January, I gave an invited talk at Jane Street, and they produced a really nice video of the talk. Those familiar with the job talk genre will be able to tell that this talk was mostly ripped off from my job talk from spring 2018, and I didn’t previously have a video of that talk; now I do! Then in February, I was invited to Portland for a week to attend and speak at WG 2.16, the IFIP TC2 Working Group on Language Design. (Their invitation email started with “Dear Language Designer”, and apparently that’s the way to my heart.) I gave a talk called “Domain-Specific SMT Solving for Neural Network Verification (Or Anything Else)” that was largely about other people’s work on Reluplex, but a little bit about the ideas that went into Peter’s and my SNAPL paper. It was my first time at WG 2.16, and I had a lot of fun catching up with some old friends, like Ron and Michael, and meeting some new ones, too.

Back at UCSC, I served on the hiring committee for our Software Foundations faculty search, which was a lot of work and involved many tough decisions. To my immense delight, the search resulted in two great new people joining our department in 2020, Tyler Sorensen and Daniel Fremont! I feel very fortunate to have joined UCSC at a moment when we’re building a critical mass of PL-and-PL-adjacent folk who support and collaborate with one another.

On the student recruiting front, I reviewed applications to our Ph.D. and MS programs and made several offers to strong applicants (some of whom came to our Ph.D. admit visit day, which was fun). I also started working in an unofficial capacity with an undergrad who had approached me about research and was interested in learning Haskell, and we started planning out a project that would fit in with the domain-specific solver stuff that I had been thinking about.

And of course, we ran the first-ever !!Con West in February, and it was a great success! We filled up Baskin Auditorium with excited and engaged people, a lot of whom said nice things about us, and UC Santa Cruz Magazine did a lovely article about us. We also established a new nonprofit foundation (for which I’m currently serving as president of the board of directors) in place of our old LLC, which helps set us up for long-term sustainability and scalability.[3]

Spring 2019

In spring, it was my turn to teach our undergrad distributed systems course! I was the first person other than Peter to teach this course since his 2016 redesign of it, which had proven popular; he had taught it five times since then. (Prior to that, it had not been taught since 2008.) I had 84 students — the course was supposed to be capped at 80, but I let a few extra students in.

Although I used Peter’s course design in the sense that I covered more or less the same topics he did (and adopted his practice of writing on the board instead of using slides), I didn’t use any pre-made course materials from him, because those materials did not exist. Instead, I took the notes I’d taken from his course in the fall and turned them into lecture notes for myself. I also looked at lots of other people’s distributed systems lecture notes on the web (in some cases, discovering that they had all been cribbing from each other, too). More than once, I had to go back to the original paper about something in order to really understand it well enough to teach it. As always, teaching is the best way to learn — there’s a whole laundry list of distributed systems concepts that I really wasn’t particularly solid with before teaching this course, but that I feel very comfortable explaining now. I had particularly been dreading having to teach Paxos — I actually thought about leaving it out of the course entirely (which I think would have been a justifiable pedagogical choice, actually), but I’m really glad I made myself go through with it!

The course had a challenging project in which students worked in teams to implement a replicated, sharded, causally consistent key-value store, packaged as a Docker container that exposes a particular REST API. Teams could use any language or framework they wanted; we (where by “we”, I mostly mean my superhuman TA, Reza NasiriGerdeh) tested their work by making HTTP requests to the API and checking the responses. I didn’t actually implement the entire course project myself, which I consider to be a personal failing. I started working on a Haskell implementation using the Servant framework, but only got as far as implementing the first two of the five assignments. (I finally implemented the third assignment over the summer, with help from a more experienced Haskeller who gave me some excellent refactoring advice.)
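For readers wondering what "causally consistent" involves in practice, one common building block is the vector clock, which lets a replica tell whether one event causally precedes another or whether the two are concurrent. The sketch below is only an illustration of that general idea, not the course's assignment spec or a reference solution; the function names and the dictionary-based clock representation are assumptions made for the example.

```python
def compare(a, b):
    """Compare two vector clocks, each a dict from replica id to counter.

    A replica id missing from a clock is treated as having counter 0.
    Returns "equal", "before" (a causally precedes b), "after", or
    "concurrent" (neither precedes the other).
    """
    replicas = set(a) | set(b)
    a_le_b = all(a.get(r, 0) <= b.get(r, 0) for r in replicas)
    b_le_a = all(b.get(r, 0) <= a.get(r, 0) for r in replicas)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


def merge_clocks(a, b):
    """Combine two clocks by taking the per-replica maximum, as a
    replica might do after learning about another replica's events."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in set(a) | set(b)}


# A write tagged {"r1": 1} causally precedes one tagged {"r1": 1, "r2": 1}.
print(compare({"r1": 1}, {"r1": 1, "r2": 1}))
# Writes made independently on two replicas are concurrent.
print(compare({"r1": 2}, {"r2": 1}))
```

A store that wants causal consistency can use a comparison like this to decide whether a replica has already seen everything a write depends on before making that write visible.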

I did a few things that weren’t part of Peter’s previous offerings of the course. One was that I had an optional “creative project” assignment where students made zines about distributed systems topics; I wrote at length about this back in June. Another new thing I did was to assemble a crew of badass undergrads (and one badass MS student) who had been active participants in Peter’s course last fall, and officially hire them as tutors and “readers” (UCSC jargon for graders) for the course. This was honestly as much for my own benefit as it was for the students. I liked having an actual course staff — a real team of people, instead of just one TA (even though my one TA was, as mentioned above, superhuman). (It’s a travesty that I had only one TA for over 80 students, but that’s how massively oversubscribed our undergrad courses are.) My badass undergrad crew immediately set up a Slack for us to use in running the course, and I used it often to get feedback from them on the assignment specs and exams. They also helped me understand a lot of things about the UCSC undergrad experience that I had previously been ignorant of.

Finally, I brought in a couple of guest lecturers. The students were really into the guest lectures, and it gave me a bit of a break from having to prepare lectures myself. Deniz Altınbüken (of “Paxos Made Moderately Complex” fame) came to talk about replication and consensus, and Jacob Repp came to talk about message routing in games (which was the result of a months-long email exchange that had started with one of his colleagues at Blizzard Entertainment cold-emailing me a while back to talk about LVars, which still blows my mind). Both Deniz and Jacob gave great talks and were incredibly generous with their time, and Jacob and his Blizzard colleague Soe, a professional Overwatch commentator with a massive internet following, even hung out and playtested student games. (And then I got to visit Blizzard and give a talk myself recently, which was really fun!) I hope this makes me a Cool Professor.

In short, the course went well. Several students from the course approached me about doing undergrad research, and at least one will be working with me this year. One of the most gratifying things I kept hearing from multiple students was that they hadn’t been sure what area of CS they wanted to specialize in, but after taking my class, they had decided they were most excited about distributed systems! Some graduating students even expressed interest in coming back and doing a Ph.D. with me eventually. Which brings me to…

…recruiting. Unfortunately, none of the four Ph.D. students that I had tried to recruit in winter decided to come to UCSC. (Not unfortunate for them, of course, because I think they all ended up in wonderful places where they’ll do well! Kind of unfortunate for me and UCSC, though.) The good news is that a couple of very strong MS admits are joining my group, and they may eventually join the Ph.D. program.

In May, I went to give my talk at SNAPL, which has a fun and effective discussion format. I had a great time. I didn’t know what SNAPL would be like, but it was a really great couple of days, densely packed with good people and good conversations. I felt like I was among my people — it seemed as though folks were excited about the kind of intersection-of-PL-and-systems-work that I want to do. The only bad thing about SNAPL this year was that, unlike in 2015 and 2017, it was held on the other side of the country instead of in my backyard, so I had to skip a day of class to go. (It was only by taking a red-eye flight that left right after class — which was, of course, delayed, almost causing me to be late for my own talk — that I managed to avoid skipping two consecutive days of class. West-coast-to-east-coast trips suck.)

As a result of conversations had at SNAPL, Mike Hicks asked me to update my “My first fifteen compilers” blog post from a couple years back and publish the updated version on the shiny new SIGPLAN blog. I think the new version of the post is an improvement, thanks to Mike’s thoughtful shepherding — there’s less complaining about the word “transpiler”, and more about the actual advantages of the nanopass framework and of back-to-front compiler development. (I’m still happy to complain about the word “transpiler” in person if anyone is interested.)

I capped off the spring term by going to my first-ever Shonan Meeting — on “Programming Language Support for Data-Intensive Applications” — which was an absolute delight. After the busy spring, it was a gift to be able to spend a peaceful week on a green hillside in Japan talking shop with many of my favorite people, and I can’t think of a better way to have ended my first year as a professor.

Looking back; looking ahead

The TL;DR of how my first year went:

  • I think research went okay, although I didn’t make as much headway as I wanted to this year. I have ambitious plans that I’m not sure how to realize. Now that I have some students, though, we can start just trying stuff and seeing what happens, and I’m really excited about that.
  • I was disappointed by how my Ph.D. student recruiting went. I’m thrilled about the undergrad and MS students I do have, though, and I’m excited about my recruiting prospects for next year.
  • I’m zero for one on fundraising, which is really bad, and so in the coming year I need to prioritize it. My startup funding will support my students for now, but as much as I want more students (and for my existing students to stay and become Ph.D. students), it would be irresponsible of me to recruit more students without also finding ways to support them.
  • Teaching went well, I learned a ton from teaching, and I’m especially happy to have made personal connections with many students, including some who I’m now working with more closely on research.
  • !!Con West went great. It’s wonderful to be in an environment where I can say that !!Con is “an integral part of my outreach mission” and feel like I’m being understood and supported in that mission.
  • I want to get more into the technical weeds on various projects. Happily, this summer, I’ve finally had time to put my head down and write some code, as well as figure out things like why my Docker builds were taking so long (and get them to not take quite so long).

I’m busier than I’ve ever been in my life, and I’ve had to say no to more things than ever.[4] But, although I have way too many things to do, most of the things I have to do are awesome! I adore my job. UCSC students amaze me with their boundless energy, optimism, and inquisitiveness, and my colleagues are a joy to work with. I’m especially excited for the next few years at UCSC as we welcome new faculty and students to the department and figure out, together, how to continually create the department we want to be.

  1. Actually, now isn’t a good time. I have lecture notes to make for a topic that I’ve never taught about before. I have a grant application due imminently that needs work. I need to begin rounding up funding for !!Con, update the !!Con website, update my own website, send a bunch of information to a new student, and hopefully clean up the more disgusting parts of the lab before the new students move in. But, hey, it’s not like I’m going to be less busy any time soon.

  2. You could argue that doing introductory data science MOOCs wasn’t a good use of my time in my first three months as a professor. You would probably win that argument. I have to say, though, that from my vantage point a year later, as I get ready to teach an introductory undergrad course this winter that will also use Jupyter notebooks, I’m really glad that I got to see firsthand how the Berkeley Data 8 course pedagogy and infrastructure works, and that experience has made me better able to productively contribute to a discussion with my colleagues about how we should run our own intro courses this year. Moreover, taking Data 8X took me down a mathematical rabbit hole that eventually got one of my blog posts about linear regression added to the syllabus for a course on “probability, statistics, and computational reasoning (including elementary programming) for law students” taught by Paul Gowder at Iowa Law, so that was cool.

  3. 2019 has been a big year for !!Con – it’s the first year we’ve run two conferences (!!Con Original Flavor ran, as usual, in New York in May). That said, for the first time ever this year, I was not directly involved with the organization of !!Con in New York. This is a good thing – three years ago, I said that I thought it would be a victory if I could stop being involved with !!Con and for it to continue to grow and get better without me, and although I’m still involved with the umbrella organization, I’m very happy that as of this year, !!Con New York is indeed thriving without any direct involvement from me!

  4. There was a time, not all that many years ago, that I thought it would be a true sign of having Made It if I got asked to be on the so-and-so program committee or invited to the thus-and-such meeting, and I never would have imagined saying no to such an invitation. Yet, this year I had to say no to multiple such invitations because I already knew I was going to be overburdened.
