
On Evidence in Learning Programming


As we build a new digital and computational program (Digital and Computational Studies, or DCS) at Bates, I wrestle with a simple question: what role should evidence play in the design and development of DCS? Today, my reflection will ultimately focus on the language and tools we use to introduce students to computing. However, I will begin with a short digression about teaching in the classroom, because most people are familiar with the experience of taking a class, whether at the primary, secondary, or post-secondary level.

Evidence-Based Practices In the Classroom

Next fall I will be teaching a CURE: a Course-based Undergraduate Research Experience. This is a course structured around students engaging in research inquiry: we will attempt to answer a question whose answer is unknown (and, ideally, of interest to people other than just us). There is evidence in the literature regarding the value of research experiences for undergraduates, which is why these kinds of experiences are being built into students’ curricular pathways. There are also questions we do not yet know the answer to regarding the efficacy of CUREs as an instructional vehicle. (That makes sense; there is a great deal we do not know about teaching and learning, so to say there are unknowns or things yet to learn anywhere in the space of teaching and learning is, in truth, not meant to be a stone thrown.)

There are, however, many things I need to keep track of in the classroom if I value evidence in my practice as an educator. For example, my classroom practice—how I interact with my students—is a critical space for me to focus. Just a few examples:

  • When I ask questions to the class, I should make sure I count (one Mississippi, two Mississippi... to roughly 10 seconds), and give students time to think.
  • Or, perhaps I have my students think, then pair up and discuss, and then share out. This lets them explore ideas in a small group before hazarding the (to some, intimidating) sharing of ideas in a large group.
  • I should randomize my selection of students using an external aid—perhaps a deck of cards with their names on the cards—so that I don't make a habit of calling on only women in the class, or men, and so on.

(As an aside, regarding the last bullet… I had a colleague who learned she only called on students on the left-hand side of the class… and she only learned that after years of teaching because she allowed her classroom to be videoed. It was an ingrained habit that was invisible to her, and clearly left the right-hand side of the room out of every conversation she facilitated in her class.)
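The deck-of-cards practice above is easy to automate. As a small sketch (the student names are invented, and this is just one way to model a shuffled deck; any external randomizer serves the same purpose):

```python
import random

def cold_call(roster, seed=None):
    """Cycle through students in a shuffled order, reshuffling only
    after everyone has been called on once -- like working through a
    shuffled deck of name cards."""
    rng = random.Random(seed)
    deck = []
    while True:
        if not deck:
            deck = list(roster)
            rng.shuffle(deck)
        yield deck.pop()

roster = ["Ada", "Grace", "Alan", "Barbara"]  # invented names
calls = cold_call(roster)
first_round = [next(calls) for _ in range(len(roster))]
# everyone is called exactly once before any repeats
assert sorted(first_round) == sorted(roster)
```

The point of the external aid (cards or code) is the same: it removes the instructor's unconscious selection habits from the loop entirely.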

This list of evidence-based practices is actually small; there are 20-30 I could keep track of, and at this point in my career, I use many of them on a regular basis. There are still practices I don’t use reflexively, and that’s something to continue working on. (And I don’t even track their use in a way that I could consider evidence if I were going to write up a report on the work that I do in the classroom.) In other words, I am aware of places where my engagement with students can still be improved by evidence-based practices, and of the amount of work I would need to do in order to communicate that evidence to others.

Evidence-Based Practices in Teaching Programming

All of this, however, leads up to a space that has been invariably difficult in every program and department I have ever taught in: the choice of the programming language we use to teach novice programmers. In truth, it is more complex than “just the language.” We need to consider:

  • the tools we use to program in that language
  • the computing environment that exists around those tools (be it UNIX, Windows, Mac, or the WWW)
  • the text(s) that support the learning of those tools
  • the resources available in the community-at-large (video, weblogs, etc.) from learners and practitioners to support ongoing exploration and multiple perspectives
  • support for transitions to/from the language and tools
  • support from colleagues across campus in the foundational choices being made outside of their department(s)

The list can get very long. My point here is that these are complex tools, involving complex ideas at every level, and that the complexity is a cross product of tools, languages, environment, support resources, and the socio-cultural context of the institution: it is (in no small part) a result of the whole system of considerations, not any one dimension. It is often the case that our literature regarding novice programmers fails to peel apart this complexity, or worse, fails to engage in good scholarship, and instead appeals to authority as a rationale for our actions when it comes to teaching novices.

On Authority and Evidence

When it comes to discussing these languages and tools, it is commonplace for computer scientists (and any practitioners who work in computational spaces) to appeal to authority when making decisions about how (and what, and why) to teach programming. That authority might be in the literature, but more often than not, it is personal authority (years of experience), or a limited set of experiences with a particular environment or book (but no evidentiary inquiry), or (perhaps the most dangerous) an appeal to the current marketplace: what is “popular” right now with employers in the post-graduate marketplace, as opposed to what tools are best for introducing students to the learning of computing and programming.

My colleague Mark Guzdial, recently moved to Michigan State from Georgia Tech, wrote a piece for the Communications of the ACM that began to explore the idea of authority and evidence in the teaching of programming. His article was essentially exploring two themes.

  1. One theme of Mark's article was to rebut the myopic and sexist perspectives in an article that was making the rounds at the time. It is important that Mark engaged in this rebuttal, but I don't want to give more oxygen to the small-minded and ridiculous belief that women—simply because they are women—cannot excel in computing. There has never been any evidence of this, nor will there be. This is an important theme unto itself; I agree 100% with Mark that there is nothing in the learning sciences literature that even remotely suggests any biological/physiological difference between human beings when it comes to learning programming. But it is not the thread of my argument here.
  2. The second theme of Mark's article was the preoccupation of computing, as a discipline, to appeal to authority. I want to explore this further.

This came up just yesterday (November 14th, 2018) on a disciplinary mailing list; in particular, the question was asked:

I’m teaching an intro to programming class this coming spring for students with zero background in coding. I plan to use Python to ease them into the basic programming concepts (not sure about the IDE yet), and then transition to Visual Basic to give them access to a nice GUI builder and also the ability to use some of these skill for possible scripting in MS Office or other automation tasks. The second language also serves to demonstrate how much of the knowledge learned in one language can transfer to another.
Finally, if anyone would be willing to share their syllabus, or project ideas that were highly engaging and fun for students in a similar course I would be very appreciative. Right now I'm thinking data manipulation/analysis type tasks mostly for Python, while VB and the GUI might be nice for some small utility or db type programs perhaps - open to suggestions.

There’s so much to unpack in this question. I won’t do it justice, but I’ll try and summarize the key issues.

  1. Language Choice. What rationale does the asker have for using Python? What evidence is there to support its use in the classroom? They do go on to mention that the rest of the curriculum is taught in Java...
  2. Tools. The asker has no idea what tools they will use for teaching Python... yet, tools matter a great deal when learning to program. We'll come back to this.
  3. Multiple Languages. What rationale does the asker have for using two (very) different programming languages in a 15-week span of time?
  4. Motivation. The asker suggests that they want "fun" projects. What does the asker mean by "fun?" How does this relate to their goals and outcomes for the course, and (more broadly) for their department and institution?

There is more to unpack in those two paragraphs, but this is a starting point that gets to the core issues and challenges I see in using evidence-based practices in the first teaching of programming at the college level. Mark responded to the thread (referencing his previous CACM article), and reminded us of some important points (which I paraphrase/expand on here):

  • The language matters. It shapes how students think about what they are doing; some languages are easier to learn than others (because they were designed, intentionally, for learners); and we can study this (and have).
  • The UNIX command-line is not simple. It was developed by experts for experts. There are many HCI design principles that are not at work in the UNIX command line. It is effectively a language unto itself, and therefore should be treated as a complex learning space just like the act of programming itself.
  • Professional programming environments are too complex. Environments like RStudio—a popular choice (or nearly the only choice) for writing R scripts for data analysis—were designed by and for experts. (Actually, it is unclear whether the people who developed R were expert software developers with any knowledge of usability. They may have been biologists who learned to write code.)
  • There are programming environments designed for novices. Environments like BlueJ, DrRacket, MakeCode, Scratch, and App Inventor (to name a few) are designed, top-to-bottom, with the beginner in mind. We have good research about (some/most of) these environments, and we have empirical evidence that they make a difference in the learning our students engage in, our students' ability to retain that learning, and their desire to keep taking courses with us and continue learning more.

We can dive deep into any of these dimensions, but I want to continue to pause on the original question posed on the SIGCSE mailing list: what language do I choose? In particular, I’m going to reflect briefly on the kinds of pressures we often feel as educators in an institutional context when it comes to these kinds of decisions.

Language Choice: Pressures

The rationales for language choice are often motivated by pressures from colleagues, students, and the marketplace. I want to consider each of these briefly.

The marketplace is fickle: every few years, something new is “hot,” and “the thing to learn.” Currently, the flavor-of-the-week might be Google’s Go, which is intended to be a concurrent answer to systems programming languages like C. Or, perhaps it isn’t a language, but instead “machine learning,” suggesting that it is important to know how to use TensorFlow (a library for doing machine learning work), or some other tool that was just released last week that I haven’t heard about yet. Either way, the marketplace has nothing to do with the teaching of people who have never written code before; it is the space of experts who spend 40+ hours/week on their task, and have the time to master complex, and sometimes rapidly changing, tools.

While I have a great deal of respect for my students, the few who have strong opinions about what language we should be using probably have had minimal experience using the tools they profess would be best. Or, they have read a blog about the most recent Thing to appear in the marketplace, and therefore they believe that is critical for us to learn. Students do not walk into Calculus and insist we use some new notation; they expect the Leibniz notation to be used (if they have any expectations at all), and that’s that. But they walk into courses involving programming full of ideas. That’s wonderful, but it isn’t evidence.

Colleagues know the tools they know. They’re generally overworked, and rarely have interest in learning new tools. From their perspective—especially if your course is a “feeder” to their courses—it would be best if your course taught the tools they are using. It does not matter if your institution has faculty using multiple tools… any one colleague will want your students to learn the tool they use. The choice of tool that your colleague uses is rarely evidence based, but instead is what their research group used, or what they learned as an undergraduate, or what the marketplace is currently centered on within their discipline.

At Bates, we use Stata in Economics (and some R), R in Politics, SPSS in Psychology, Python and Matlab in Mathematics and Physics (and probably some C/C++), and Isadora and Max/MSP (amongst other programmatic tools for multimedia work) in Art/Music/Dance. No one is casually prepared to retool their teaching or research, but it is probably the case that most faculty would prefer, if there is going to be an introduction to computation and programming, that it prepare students for their particular flavor of computation and programming. The fact that these are radically different contexts, with radically different tools, is generally secondary in the thinking of any one faculty member or department.

If it were as simple as making an evidence-based choice, I would likely ground students’ experiences in a block-based environment in a first course, and have two courses that further introduced them to the structured approach to programming epitomized in How to Design Programs, which anchors the (evidence-based) Bootstrap curriculum (for middle-school learners) and a design-centric approach to software construction at the college level. However, these choices (when made in a department or on campus) tend to be political and negotiated. It isn’t clear that research and evidence are necessarily enough to convince colleagues that the tools and environments they know might not be the right ones for students taking their first steps on a journey the faculty took so long ago that they’ve forgotten what it was like.

Language Choice: Evidence

Weintrop and Wilensky recently published a marvelous study of 4000+ students and their first learning of programming using block-based languages. Their question was the following:

How does block-based programming compare to text-based programming in high school introductory computer science classes with respect to learning outcomes, attitudes, and interest in the field of computer science?

The paper is worth a read. The essence of their results is that students gained more confidence pre/post with block-based environments, demonstrated greater learning gains on content using block-based environments, enjoyed themselves more (block-based), and were substantially more interested in taking further computing courses.

There are few other programming languages and environments that have a body of research around them that is coherent and evidentiary. BlueJ has scholarship around its objects-first approach, including the very coherent STREAM process that Michael Caspersen and Michael Kölling have published (which effectively represents a culmination—though not a stopping point—of this line of work). A great deal of research undergirds the development of the Racket programming language, its associated (free) text How To Design Programs, and the tower of languages provided to support learners (from the Beginner language, to the Intermediate language, and so on)—each of which was designed, based on evidence from use, to support learners from the syntax and structure through to the kinds of errors they can experience. Kathi Fisler’s work around the Rainfall problem (The Recurring Rainfall Problem; Sometimes, Rainfall Accumulates) captures the current state of inquiry around this ecosystem of language and environment, which has seen continuous use, development, and study for over 20 years. (Arguably, because Racket is a close design descendant of Scheme, we have been studying these tools and their use with students since the late 1960s.)

In Closing: What To Do?

Sally Fincher et al. looked at how we, as educators, change our practice. In their paper Stories of Change: How Educators Change their Practice, they asked 99 educators (mostly computer science educators or closely related) to address the following question:

Can you think of a time when something—an event, an article, a conversation, a reflection, an idea, a meeting, a plan—caused you to make a change in your teaching? What was it? What happened?

The work led them to the following result:

Of the 99 change stories analyzed, only three demonstrate an active search for new practices or materials on the part of teachers, and published materials were consulted in just eight of the stories. Most of the changes occurred locally, without input from outside sources, or involved only personal interaction with other educators.

Bringing this all the way back from the global to the local, I would claim Fincher’s article should give us pause as we develop a new computational program at Bates. It raises difficult questions regarding the role of evidence in the design and development of courses, our choices of tools and languages in teaching computing, and how we engage across disciplinary boundaries in that work.

Perhaps, through intentional design, and a willingness to commit to new learning on the part of ourselves and our colleagues (an expensive proposition in time), we might decide that evidence matters. However, we might also decide that the evidence is not “good enough,” in which case we will help ourselves feel comfortable doing what we “know best,” because we decide the evidence is not of sufficient quality or rigor. In other words, it is easy for all of us to make the comfortable choice of privileging our own knowledge and expertise, making a kind of “internal appeal to authority” when faced with change or the unknown.

I believe the most dangerous reason to make choices is because we are in a hurry. If we rush, we are unlikely to actually explore and discuss evidence-based practices in computing, and will instead “just teach Python and R,” because that is a safe set of choices in the current climate, both on campus and in the marketplace. (These languages are, after all, the languages of machine learning and data science!) But neither of these tools has a rich base of evidentiary research in the novice programming context, and both lack infrastructure to scaffold the learner well. We could build that infrastructure, and develop the associated research… but that, itself, is a monumental undertaking.

In short, as a computer scientist and computing education researcher who cares deeply about understanding the what, the why, and the how of my teaching… I’m uncertain what the best course of action is when it comes to engaging in what feels very much like a campus-wide (or certainly multi-department) dialogue around the teaching and learning of programming. How (and even whether) to advance the state of evidence, and weather the attendant questioning and attacks, is hard.

The question is, in short: should evidence play a role in language and tool choice as we design a new digital and computational program at Bates? I feel like I know how I would want to engage with that question, but that is different from how the department, or even the community, might want to engage its time and energies.

On Evidence in Learning Programming 20181115

Rebuilding the TVM


Over at Hackaday, there was a Retrotechtacular post on the Transputer that generated a fair thread of comments and conversation. I put out a call for help to revitalize the occam toolchain a bit.

For the colleagues who replied, here’s a short history/roadmap.

I will Sum Up

The full story is long. We have a codebase that includes the original occ21 occam compiler from Inmos Ltd., which has been extended over the years to include not just the core occam2.1 language, but also extensions for process mobility; this language is called occam-π.

Multiple dissertations grew out of the codebase over the years. It is safe to say that the fact that the language lives at all is because of the dedication, over many years, of Prof Peter Welch (and others) at the University of Kent.

The Repository

The main repository includes a toolchain for a language, its standard libraries, and multiple runtimes.

  • Under tools, you will find the compiler, the linker, the documentation builder, and other tools that hopefully don’t factor in right now.
  • The runtime directory includes the Intel native runtime as well as the TVM, or Transterpreter Virtual Machine, which is the portable runtime. It is implemented as an ANSI C library that is intended to be statically compiled and linked into a wrapper for any given embedded target. In a word, it is an overgrown bytecode interpreter and scheduler.
  • The tvm directory contains the wrappers for the VM. When building on big platforms, the POSIX wrapper is used; “big” generally means “the platform casually builds most programs against a POSIX API.” The Arduino wrapper provides the hardware-to-VM bindings for the ATmega series of processors.


Because of extensive tests built into the occam compiler toolchain, we are confident in the VM. Yes, you can write code that crashes; a division by zero is still bad. However, we are extremely confident in the scheduler and runtime; given good input (e.g., a compiled occam2.1 program), the VM works.

For a momentary dive into the wrappers, you might begin by looking at the TVM runloop in tvm.c. It fetches the next bytecode instruction, decodes it, executes it, and hands control to the scheduler when a process blocks or yields, and does that forever.
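In spirit, that runloop is what any bytecode interpreter does. As a minimal sketch (in Python rather than the ANSI C of the real TVM, and with a tiny invented opcode set, no scheduler, and none of the real VM's richness):

```python
def run(bytecode):
    """Minimal fetch-decode-execute loop in the spirit of the TVM
    runloop. The opcodes (PUSH, ADD, HALT) are invented for
    illustration; they are not the real TVM instruction set."""
    stack = []
    pc = 0  # program counter
    while pc < len(bytecode):
        op = bytecode[pc]              # fetch
        pc += 1
        if op == 0x01:                 # PUSH <byte>
            stack.append(bytecode[pc])
            pc += 1
        elif op == 0x02:               # ADD: pop two, push sum
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == 0x00:               # HALT
            break
    return stack

# PUSH 2, PUSH 3, ADD, HALT  ->  stack holds [5]
assert run([0x01, 2, 0x01, 3, 0x02, 0x00]) == [5]
```

The real runloop adds the pieces a toy like this omits: a process queue, channel communication, timers, and the interrupt bridging described below.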

The wrapper declares some constants that lay out the size and location of RAM (which we provide an overlay for; the VM runs fine on big- and little-endian hardware), and hooks for getting the time. We provide an FFI table for linking occam programs to native code (a tricky dance), and an interrupt mapping table. The interrupt mapping lifts hardware interrupts into the occam runtime; while this does mean that we are lifting a real-time event into a soft real-time context (a “drawback” for some developers), it also means that we are lifting random, unpredictable behaviors into a well-reasoned framework for parallel and concurrent programming. If you are an engineer who needs real-time software, this is not it. If you are an engineer who needs correct, concurrent execution of code on embedded devices, and can live with soft real-time, then I have found occam on the TVM to be a joy to develop with.

(The curious can see some slides that cover this work, as well as a paper.)

The VM was never well integrated into the build for cross-compilation targets. We typically drove much of this from one or more programming environments (e.g., a plugin or similar for an IDE), and we would simply plunk the compiled VM down, and ship bytecode to it. As a result, the build for embedded targets was never a first-class citizen of the project.

The Build

The build currently builds everything.

For embedded work, we really only need:

  • The TVM runtime, cross-compiled for the target architecture.
  • A wrapper, cross-compiled, and linked against libtvm.a.

There’s often a bit of a dance to then get things right for the platform. For example, we had a complex way of doing bytecode upload to the ATmega series of processors; you could upload the VM once, and then upload just the bytecode repeatedly. I personally think that it is now easier just to:

  1. Compile occam2.1 programs to bytecode.
  2. Let the linker spit out a .h file containing the bytecode in an array.
  3. Link the code into the wrapper/TVM.

This gives you a single binary for shipping to the target platform. Doing anything fancier would need to be justified by some use case. Given that the VM is around 20K when compiled for the 328p, and the bytecode is tiny (it is Huffman encoded), the result is far, far smaller than many of the 500K monstrosities being shipped as part of Python and JavaScript runtimes for ARM cores (e.g., the Circuit Playground Express, or the BBC micro:bit).
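Step 2 in the list above, emitting the bytecode as a C array, is entirely mechanical. A sketch of such a generator (the array name and output format here are invented for illustration; the real linker's output may differ):

```python
def bytecode_to_header(data, name="tvm_bytecode"):
    """Render compiled bytecode as a C header so it can be linked
    straight into the wrapper/TVM binary. The symbol name is a
    hypothetical choice, not the toolchain's actual convention."""
    body = ", ".join(f"0x{b:02x}" for b in data)
    return (
        f"const unsigned char {name}[] = {{ {body} }};\n"
        f"const unsigned int {name}_len = {len(data)};\n"
    )

# three made-up bytes of "bytecode"
header = bytecode_to_header(bytes([0x1e, 0x48, 0x00]))
print(header)
```

Once the header exists, the wrapper simply `#include`s it and hands the array to the VM at startup; there is no separate upload step to get wrong.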

The compiler does not need to be cross-compiled. I have (in the past) had it running behind a web service, so that the end-user can write their code locally, the code is compiled “in the cloud,” and a .hex is shipped back. I would, today, do this in a Docker container, so it could be run locally (by an expert developer), or be used remotely (by students, casual users). I am happy to bring this back to life.

So, in short:

  1. Toolchain. GCC has moved on since the project was under active development. Some cleanup may be necessary, but there should be no reason for things to have completely gone to hell in a handbasket. Or, perhaps bitrot can be that bad.

  2. VM. The VM needs to be well supported in terms of adding cross-compilation build targets.

  3. Wrappers. At the least, a wrapper for a modern target needs to be developed. This is not hard, as evidenced by the ATmega wrapper—which probably serves as a good roadmap, and probably had to jump through more hoops than other platforms will require, as 1) the libc for the AVRs is limited and 2) the mixed memory architecture of the device made some things more complex than necessary.

I personally think there must be a way to pull out the necessary pieces for embedded development, so as to simplify the process. By “must be a way,” I mean “a way of reorganizing the repository so it is not monolithic.” Perhaps this means using git submodules or similar to reorganize the repository.

I’m happy to discuss any branching/reorganization of the repositories that gets us to a goal of easily building occam and the TVM for more embedded targets. I really do prefer this language for environmental sensor work over any variant of C—especially when it comes to teaching students to reason about concurrent systems.

Native vs. VM

The “native” runtime involves 100+ KLOC of C that, to get fast context-switch times, plays games with the Intel processor’s stack in assembler. It is, in my opinion, the wrong idea to ever touch this code, or to imagine targeting any other processor. That would be slow and error prone over time, as each new architecture may require subtle work in the runtime.

The TVM “just works” anywhere you can compile ANSI C. It’s slower, but honestly, I care about correctness, and ease of portability/maintenance. At roughly 10KLOC, it’s a lot more manageable.

Notes about Targets / Random Ideas

We have targeted builds for the TVM at devices like the LEGO NXT in the past. To do so, we first targeted a runtime; this meant we built on top of a small OS. If an embedded target has FreeRTOS, it is completely reasonable to target a wrapper for the TVM that builds on top of FreeRTOS. Loading bytecode can then be more easily accomplished on that platform—because you’re no longer working on the bare metal, but instead have an API between the VM and the hardware. It would also mean that builds to new architectures that support (say) FreeRTOS would be one-and-done, as we’ve targeted an abstraction (as opposed to having to develop a new wrapper for each new target).

This is what we do for the POSIX interface. It sometimes complexifies the interrupt handling… for example, we had to do a fair bit of work to make screen/keyboard handling work on POSIX platforms, but for embedded targets (even with a lift of something like QNX or FreeRTOS) it will likely look more like the Arduino wrapper than anything else.

I have also wondered: would it be good to “port” the build of the VM there? Would this provide some “lift” for targeting future embedded targets? The library should “just work” across multiple platforms, and each would need a wrapper. Once that was done, it becomes a matter of tying in a call to compile the occam code.


That was long, but hopefully provides a roadmap. The project has had many hands over many, many years—going back to the Inmos engineers who did the original work on Transputer and the language itself. I believe the toolchain has value, especially for teaching students how to reason about concurrency and parallelism on engaging and real hardware. The transition (in my experience) from occam to languages like Go or Erlang, to reasoning about concurrency in Javascript (everything is a callback), or even architecting concurrent and parallel code using semaphores and threads—all of those things are easier when you have a coherent model for reasoning about concurrent systems.

Being a college professor, helping launch a new department, takes time. However, if there are people who are keen to help bring this project back to life, it will provide motivation to support those people and help make it a reality.

Debugging I2C


The I2CDriver is a new product from James Bowman of Excamera Labs. It is pictured at right. I’m going to get around to talking about it shortly.

Our controller board has two components on it: a processor and a clock. The processor talks to the clock using a two-wire protocol called I2C. I2C allows a controller to talk to peripherals using a low-speed serial protocol. More importantly, it allows for multiple peripherals to be on the same bus. Here, “bus” means “everyone is talking on the same wires,” but each peripheral has a unique address, so that they only listen for (and respond to) their own messages.
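The addressing idea can be sketched as a toy model. Here, every peripheral “hears” every message, but only the one whose address matches responds (this is only the addressing concept; real I2C also involves a clock line, ACK bits, and a read/write bit; the classes here are invented):

```python
class FakeBus:
    """Toy model of an I2C bus: all peripherals share the same wires,
    but each listens only for its own 7-bit address."""

    def __init__(self):
        self.peripherals = {}  # address -> handler

    def attach(self, address, handler):
        self.peripherals[address] = handler

    def write(self, address, byte):
        # Every device sees the traffic; only the addressed one replies.
        return [handler(byte)
                for addr, handler in self.peripherals.items()
                if addr == address]

bus = FakeBus()
bus.attach(0x76, lambda b: f"sensor got 0x{b:02X}")  # pressure sensor
bus.attach(0x68, lambda b: f"clock got 0x{b:02X}")   # invented clock address
assert bus.write(0x76, 0x1E) == ["sensor got 0x1E"]
assert bus.write(0x20, 0x00) == []  # nobody home at that address
```

This is exactly why one pair of wires can serve a whole stack of sensor layers: adding a peripheral is a matter of attaching a new (unique) address to the shared bus.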

On our controller board, we can do things like set the clock by sending I2C messages to the clock chip. Or, we can plug additional layers into our sensor stack, and use I2C to talk to other peripherals on other layers. Most recently, we have been working on a board that has only one component on it: a barometric pressure and temperature sensor.

The sensor is small; while it looks huge (at right), it is actually only around 6mm in diameter. It has 8 connections underneath it, and two of them are for a processor to talk to it using I2C.

We have tried, repeatedly, to build boards that use this sensor. We have assembled three, and each one failed to work. We put some code on our controller, ran it, and nothing worked. We have one board with the same sensor from Sparkfun, and when we run our code, it works perfectly. So, we knew something was wrong with our board.

One of our first steps was to reflow the board. We got out our trusty air rework station, held the part in place, and carefully heated the pads. The solder reflowed under the pads, further securing the sensor to the board we designed. We tested continuity—making sure that the sensor is actually connected, electrically, to the controller—and everything tested out.

Once we did this, we plugged in our I2CDriver. This allowed us to send individual I2C messages to the sensor. Based on our reading of the datasheet, we should probably send a reset when the sensor first powers up. So, we did.

i2ccl /dev/cu.usbserial w 0x76 0x1E

This command uses the i2ccl program (which James Bowman provides with his I2CDriver) to write the message 0x1E (which is hex, and represents the pattern 00011110 in binary) to the I2C device that has address 0x76 (which our sensor does).
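As a quick sanity check on the hex, Python can render both the message byte and the address in binary:

```python
message = 0x1E   # the byte we write with i2ccl
address = 0x76   # the sensor's 7-bit I2C address

assert message == 30                           # hex 0x1E is decimal 30
assert format(message, "08b") == "00011110"    # ...or 00011110 as 8 bits
assert format(address, "07b") == "1110110"     # the address as 7 bits
```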


This was exciting. We had a sensor that worked! It hadn’t worked before! Clearly, it was our soldering!

So, we sent another command. This command told the sensor to measure the temperature using its built-in analog-to-digital converter (ADC).

i2ccl /dev/cu.usbserial w 0x76 0x48


This was awesome. We had been trying to get these sensors to work for weeks, and we were finally making progress. So, we then told the sensor to set its read pointer to memory location 0x00, which is where the result of the temperature read is stored on the sensor. We had to do this in order to read the temperature value out.

i2ccl /dev/cu.usbserial w 0x76 0x00

This caused i2ccl to crash. (I need to figure out why, and submit a bug report.)

We tried this multiple times, and every time, it caused i2ccl to crash. The same sequence worked fine on the board from Sparkfun, but it did not work on our board.

After roughly 1.5 hours of investigation, Maddie and I had to move on. Later that day, however, I decided to stare at the PCB layout again. Our boards have two GND pins, and two 3.3V connections; this is to simplify (arguably) our layout on individual layers of the sensor.

We had connected GND to GND, but we had not connected 3.3V to 3.3V. Why? Because we knew they were connected on the controller board.

But, we were testing without the controller board.

So, as a result, our sensor was not getting power.

However, when we plugged in the I2CDriver, our sensor was getting just enough power through the I2C lines themselves to turn on and respond to some low-power commands (like reset). Asking it to take a measurement, however, put it into a mode that required roughly 1.5mA. That is a very small amount of current… but it was more than could be drawn “parasitically” from the I2C pins, and the resulting brown-out was probably enough to kill the processor inside the sensor. As a result, our sensor “crashed” whenever we asked it to take a reading, or worse, it “crashed part way,” which put it in a state of sending garbage back to the I2CDriver when we attempted a read.
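A back-of-the-envelope calculation makes this failure plausible. Assuming typical 4.7kΩ pull-up resistors on a 3.3V bus (an assumption; I have not measured our actual pull-up values), the most the sensor could sip parasitically sits right at its measurement demand:

```python
# Rough check on the "parasitic power" hypothesis, assuming typical
# 4.7 kOhm I2C pull-ups on a 3.3 V bus (illustrative, not measured).
bus_voltage = 3.3      # volts
pullup_ohms = 4700     # one pull-up resistor

# Worst case, the sensor can only sip current through the two pull-ups:
max_parasitic_ma = 2 * (bus_voltage / pullup_ohms) * 1000
required_ma = 1.5      # roughly what taking a measurement demands

print(round(max_parasitic_ma, 2))  # → 1.4
assert max_parasitic_ma < required_ma  # not enough: the sensor browns out
```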

To be honest, I don’t know whether our sensor crashed “all the way” when we asked it to do a temperature reading without enough current available, or if it ended up in some half-way, indeterminate state… my guess, based on how it failed, is that we managed to only partially crash the sensor. I suppose we might say that the sensor was only mostly dead.


Either way, finding the error in our board took a fair bit of debugging. In the sensor stack, the missing connection wouldn’t have been a problem. On the breadboard, while testing, it was a huge problem.

The fix? Connect 3.3V to 3.3V on the layers of the sensor, and all will be fine in the stack and in testing.

The takeaways of this story:

  1. Debugging hardware is really hard. You have limited tools to debug what is going on, and when it comes to digital communications, you need a logic analyzer of some sort. We have both a Saleae Logic 8 Pro in the lab, and now the I2CDriver. We have used both in our work on these sensors, and they’re both invaluable.

  2. Debugging hardware takes patience. You have to read documentation carefully, test and probe systematically, and question every single thing about your design and build. There is nothing that can be taken as a given.

  3. Debugging hardware is a game of constant learning. I have been learning for the past 10 years the ways that hardware can fail, the mistakes you can make in a board design, and how to use the tools I have to debug problems when they arise. I will have to keep learning, because I expect I will keep making new mistakes.

I was, however, very glad to have the I2CDriver. At just under $25, it is a no-brainer tool to have on the bench. I probably could have debugged this with the Saleae, but being able to use the I2CDriver to send commands one-at-a-time was what let us figure out exactly where the failure was. I’m tempted to get a second for the lab, just so I can leave one connected up to a test/dev machine at all times.

Our next step is to get our battery charging circuit working. I have no idea what will be wrong with it, or how we will debug it. But, no doubt, something will be wrong…

Motivation for a New Embedded Formfactor


I want to revisit a fundamental design premise, explicitly for the purpose of soliciting feedback. If you’ve got some background in this space, please drop me a note (mjadud at bates dot edu or @jadudm on Twitter). My students and I would welcome your input.

I want a low-cost environmental sensing solution. My design criteria/constraints:

  • BASE COST. It should be possible for me to put a bare-bones sensor together (e.g., a controller and enclosure) for less than $10.
  • ENCLOSURE. The enclosure should be submersible, and able to be built easily with COTS components that can be sourced at a typical US hardware store.
  • MODULAR. It should be possible for me to build a sensor by adding what I need, and nothing more. For example, if I need WiFi, I should be able to add it; it should not be present “by default” (thus adding cost, complexity). This goes for all aspects of the design: sensing, power, timekeeping, storage, and so on.
  • ABSTRACTED. Modular layers should be abstracted in hardware and software for ease of use. For example, every hardware component in the sensor stack should implement a common “goToSleep()” command; it should not be different on a per-layer basis, nor should it be something that the programmer needs to look up for different kinds of components. Common interfaces should be common, even if they turn out to be a “no-op” for some classes of hardware.
  • BATTERY-FIRST. Everything should be designed with the thought that we will be powering our board with a budget of 1000-2000mAh of power, and that we want to last at least 6mo to 1yr on that budget for most applications. At the least, we want to last a summer season.
  • OPEN. The hardware and software should be free and open.
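The battery-first constraint translates directly into an average-current budget. A rough sketch of the arithmetic:

```python
# Rough power budget: what average current does "2000 mAh for 6 months"
# allow? (Approximating 6 months as 180 days.)
battery_mah = 2000
hours = 6 * 30 * 24

average_budget_ma = battery_mah / hours
print(round(average_budget_ma * 1000))  # → 463 (microamps, average)

# For context: a bare 328P in power-down sleep draws on the order of a
# microamp, while many hobbyist dev boards idle at tens of milliamps.
# Hence "battery-first" design, not an off-the-shelf board.
```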

The current thinking is that we will design against our enclosure choice, which is 2” PVC. It is cheap, commonly available everywhere (it is in most every home and building in the USA), and it is easy to work with (glue, drill, cut, etc.). Smaller-diameter tubing makes design hard, and larger gets expensive. An enclosure like the mockup below costs around $4.

This design choice restricts our selection of electronics hardware.

  • Adafruit Feathers. The Feather is a 2” board, which will not fit in a 2”-diameter cylinder. While it could be rotated 90°, this limits the height of the stack. I also worry about power consumption/deep-sleep on Feather boards in the general case.
  • Sparkfun Qwiic. Sparkfun’s design only specifies the cable parameters, and nothing about the physical design of components. Mounting the free-form modules in an enclosure becomes difficult. I2C as a standard is promising, but boards (generally speaking) lack the modularity and power management features that we require.
  • Seeed Grove. Nothing about Grove is designed with these parameters in mind; they are all modules of different sizes, and the interconnects vary. We do not win on power management and modularity.
  • Pi Hats. There’s nothing in the Raspberry Pi space that is appropriate for this kind of work, given the power consumption on the Pi.

The physical board is currently shaping up to be a 1.8” wide octagon (allowing space for cabling around the board in a 2” diameter cylinder), and two headers spaced 1.4” apart. Header 1 gives us VREG, GND, “wake”, SCL, SDA, and VBATT. Header 2 gives us VREG, GND, SS, MOSI, MISO, and SCK. We get both unregulated and regulated power on each layer (VREG and VBATT), I2C, and SPI. The “wake” pin will be pulled low (or high, TBA), and toggling the line will signal to the entire stack that it should begin wakeup. (This behavior is tentative at the moment.)

(As I look at the board, I worry about plugging this in backwards. I might want to offset one of the headers by a single 0.1” step, so that it is impossible to plug them into each other “wrong,” while they remain breadboard compatible.)

My concern is that I’ve failed to ask myself enough questions, and that I’m reinventing wheels that don’t need to be reinvented. That said, I don’t believe there’s a platform that I can pull “off the shelf” and use robustly/reliably, semester-after-semester and year-after-year, with my students on a wide variety of (as-of-yet unspecified) environmental sensing projects.

Next Steps

I have three students working with me this summer on this project; we’ll be blogging more about it as we proceed, and we’ll be building prototypes with our Voltera V-One. Hopefully, we’re on a sane path…

Thoughts on Sensor Design


I’m launching a summer of environmental sensor work, and I want to make sure I’m not reinventing a wheel. Given that I don’t have any close collaborators at this point who are deep into sensor design and development, I’m asking the cloud for feedback and critique.

The Things We’re Sensing: Temperature, Depth

The fundamental sensing questions involve water: temperature, salinity, and depth.

There’s multiple locations that we might want to investigate, and we might have different sensing desires at different locations; for example, in Casco Bay, we might not want salinity, but we absolutely need depth and temperature, while in the salt marshes of the Morse Mountain Conservation Area, we need salinity, because we’re tracking the influx of tides.

Our Budget Target: $20-40

We’re also sensitive to cost: we’d like to have more, rather than fewer, sensors. This means that $50 enclosures are not OK; a $5 enclosure, if possible, would be lovely. Commercial off-the-shelf salinity and temperature sensors can run anywhere from $1000 to $2500, and they’re not reusable… so, strictly speaking, our budget targets are ridiculously ambitious. But, if we want broad sensor coverage, or to engage in community-engaged environmental sensing projects, we need to get down into the tens-not-hundreds-of-dollars range for our platform.

Our Scale: 10s of Sensors

This suggests, ultimately, the scale. We’d like to be working in the many-tens-of-sensors range. To scale to hundreds of sensors, we’ll have to think about making the design of our electronics amenable to automated fabrication, as well as having enclosures produced rather than hand-made.

Sensor Life: 6 months

Our sensors should last at least 6 months without intervention. This suggests we’re deploying after thaw and before freeze.

It would be cute if they could be recharged without being opened. It isn’t clear this is critical at this time. If it is, we’ll consider sticking a Qi charging receiver inside the sensor. That, however, suggests a battery-recharging wafer as well…

Once retrieved, we don’t want them to be thrown away; so, it should be possible to recharge them. Whether this means it is disassembled and the batteries are charged manually, or there are external charging points, or wireless charging… these are all possibilities. Ultimately, this will be determined in part by scale: for our initial testing, we will likely have 1-5 sensors. For scaling, we’ll need to think about having a way to recharge the sensors without opening them up. (This suggests other design constraints/considerations as well…)

Data: Small (100s of KB), Local, WiFi

In our first target, we’d rather like real-time-ish data. However, the “ish” part means that our sensor will be underwater, and only come up once every two days. Therefore, we need local storage, and we need to squirt over the radio when we’re brought to the surface. In an ideal world, we detect an extended shake (as we’re pulled up from the floor of the bay), and use that as our “wake event” to indicate that we should begin firing up the radio and looking for a base station.

We think we can use a COTS WiFi to cellular bridge module on the boat. Once the sensor is up, it will be above-surface for ample time (tens of minutes) to find the base station and send its current datastore.

We expect to be recording temperature and pressure on a roughly 10-minute cycle. This is a small amount of data; we can leverage either flash/FRAM technologies for storage, or we can use something bigger (uSD), which would allow us to have a filesystem interface (but with a much larger power consumption when we wake up the uSD card to write our data). However, it would be easier to recover the data at the end of a season if we discover that we had radio issues at any point (or multiple points) during the season if it is stored on a removable medium.
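For a sense of scale, here is a rough storage estimate, assuming a hypothetical 8-byte record (4-byte timestamp, 2-byte temperature, 2-byte pressure; the real record format is not yet designed):

```python
# Rough storage estimate for a season of readings, assuming an
# invented 8-byte record: 4-byte timestamp + 2-byte temperature
# + 2-byte pressure. The real format is undecided.
readings_per_hour = 6          # one reading every 10 minutes
season_hours = 6 * 30 * 24     # ~6 months
record_bytes = 8

total_kb = readings_per_hour * season_hours * record_bytes / 1024
print(int(total_kb))  # → 202 (KB for the whole season)
assert 100 < total_kb < 1000   # comfortably in the "100s of KB" range
```

This is why small flash/FRAM parts are even on the table: a whole season fits in a couple hundred kilobytes.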

Sensor Design

We are intentionally designing our sensor electronics against our enclosure design. That is, the enclosure choices are driving design choices in the electronics.

Enclosure Design: 2” PVC

We are envisioning our enclosure design to be based on 2” PVC. It is easily obtained, easy to assemble, easy to machine/manipulate, and easy to make waterproof. In a perfect world, we’d design against 1.5” or 1” PVC, but for our first pass, we’ll start with something “large enough” to be sloppy in our electronic design, but still have sensors “small enough” to be manageable.

Electronic Design: Custom 328P-based

We would like to use COTS components. For example, it would be nice to use Adafruit Feathers, or Sparkfun Qwiic components. However, none of these boards are designed for extreme low power consumption; they fundamentally assume a hobbyist who is exploring programming embedded systems. This is not a criticism of these products, but it is not clear that (unmodified) we can say “purchase the 328P Feather and stack it up with…” as a sensor solution. I have to look closely at the Feathers, and do some testing to see “how low they can go,” before I claim we cannot use them; my concern is that there is more going on on the board than we need, drawing power even while the processor sleeps.

(Part of “verbalizing” this process is so someone might say “but, Matt, you’re wrong…”)

The idea is to have a stackable set of boards that are just under the 2” inside diameter of PVC. (This drawing is based on the OD, which is something like 2.375”, but it’s close.) I’m proposing the following:

A Main Board

  • CPU. The main board has a 328P. The 328P will run at 3.3V/8MHz, reducing part count and power consumption.
  • Bus Header. A 2x5 pin header will provide stackability, and through that header we will run VBATT, VREG, GND, I2C, and SPI.
  • ISP. A six-pad (pogo-compatible) connection for the AVR ISP.
  • USB-Serial. A six-pin (90˚ male) header for USB to serial.
  • Status. A status LED.

This would be the “command” layer in a sensor stack. It has no functionality other than to have a CPU. For configurability, every other wafer in the stack has a single function as well, and stacks via the bus header. The rationale is that every single board will either have 1) an I2C-based device, 2) a SPI-based device, or 3) another 328P configured as an I2C listener device. (I’m going to reject the language of “master” and “slave”, and instead use “speaker” and “listener.”)

Note: The header might rather be breadboard compatible, instead of a 2x5? For example, a 1x5 row on the “top” and “bottom” of the board? Each side would want… VBATT, VREG, GND… and then control lines. On one side, SPI, on the other, I2C? Or, duplicate everything on both sides?

A Clock Board

  • Clock. A DS3231M. This has a lower part count than the non-M variant; specifically, it does not require a crystal.
  • Backup Battery. A CR2032 Battery.

Because the 3231 is already an I2C device, we know that, when plugged in, we can tell it to go into its deep sleep mode from the main board.

It can run off the VREG line in the bus header.

Note: This raises a question… should every board be populated so that it might have a regulator? Should that be something that is a “given” in the board design for a wafer, but we don’t choose until we actually populate a board and are laying out a sensing problem? For example, you might discover you must put the clock on VBATT with a regulator, because you simply have too much going on in your sensor stack to run everything off a single regulator. Or, perhaps (if there are two stacks, for breadboard compat.) we put two regulators on the distribution board, and the user can choose, with a jumper, whether one or both are active. That way, power from one side and power from another… No. There’s no easy way to choose which VREG you get your power from. Each wafer will be designed and implemented, and we don’t want to have jumpers everywhere…

Note: How does the DS3231 wake the controller? We might have to pass a “clock wakeup” GPIO line through the header?

An Analog Sensor Board

  • Sensor Connectors. Three-Pin Connector(s)
  • Listener Config. Address jumpers
  • Listener. ATMega 328P
  • ISP. ISP pads
  • Power. Voltage regulator

Because we want to be able to turn off any analog sensors connected to the stack, we drop a 328P onto this layer. It is configured to listen to the command layer, so that a sensor developer simply plugs in this layer, can issue an “AnalogSensors.wakeUp(),” take a reading, and then tell the sensors to go back to sleep (“AnalogSensors.goToSleep()”). The 328P can source up to 40mA of current per pin, which for our sensors is more than enough (we’re in the 10-20mA range per sensor), so we should be able to source the current for the sensor directly from the processor.

Although code must be developed for this layer, it is primarily implementing an I2C “API” that the command layer will use. Once stabilized, we should never have to modify the analog sensor layer’s firmware, and instead can safely flash it to the board and leave it for the rest of time. If we need configurability (eg. wakeup time on a sensor, etc.), then we can expand, over time, the complexity of the firmware so we have defaults as well as reconfigurability built into the API.
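At heart, the listener firmware is a dispatch table from I2C command bytes to actions. A toy simulation of the idea (Python, with invented command codes; the real I2C “API” is not yet specified):

```python
# Toy simulation of the analog-sensor listener's command dispatch.
# The command byte values (0x01, 0x02, 0x03) are invented; the real
# protocol has not been designed yet.
class AnalogSensorListener:
    CMD_WAKE, CMD_READ, CMD_SLEEP = 0x01, 0x02, 0x03

    def __init__(self):
        self.awake = False

    def handle(self, command):
        if command == self.CMD_WAKE:
            self.awake = True      # would power sensors up via GPIO
            return "awake"
        if command == self.CMD_READ:
            # A real implementation returns an ADC reading; it should
            # refuse if the sensors are still powered down.
            return 512 if self.awake else None
        if command == self.CMD_SLEEP:
            self.awake = False     # cut sensor power, enter deep sleep
            return "asleep"

listener = AnalogSensorListener()
listener.handle(AnalogSensorListener.CMD_WAKE)
print(listener.handle(AnalogSensorListener.CMD_READ))  # → 512
```

The point is the shape, not the codes: sleep, wake on a message, act, sleep again.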

Any sensing application that does not need analog sensors can, then, omit this layer of the stack.

This layer has a voltage regulator that can be optionally selected via (solder) jumper. This may need to be a common option on many boards; a given stack may be able to run off a single regulator, or we may need multiple regulators in the stack, because peak current for the entire stack might exceed what a single regulator can provide. (I’m thinking about possible high-current radio situations, for example.)

A uSD Storage Board

  • Listener. ATMega 328P
  • Listener Config. Address jumpers
  • Storage. uSD slot
  • Power. Voltage Regulator

The SparkFun OpenLog is a nice device, but pricey. Also, it stores data over serial; we need an I2C-based device to fit our stack. Ideally, we modify the OpenLog firmware, so that we can have a storage layer where we can wake up, store some data, and put the whole layer to sleep. Because the 328P is power miserly in deep sleep, we can use it (again) as an “API interface” for a common storage protocol for all storage layers.

By placing a 328P on this layer, we eliminate a great deal of code from the command layer. The command layer says “Storage.storeNext([array…])” or similar, and the driver handles squirting everything over I2C to the storage layer. If we develop the API reasonably, then the command layer can be ignorant of whether we are storing to a uSD or to a flash chip or FRAM. In short, we should be able to store sequential sensor readings easily, without making the developer worry about the particular medium they are storing to.

These abstractions will, ultimately, be limiting. But, they will be flexible abstractions. We can always improve the API running on the storage layers. For example, should a developer be able to issue a single command to “Storage.storeNextWithTimestamp([array…])”, or should the programmer be responsible for first getting a timestamp, combining it into a structure for storage, and then sending that to the storage layer? While not yet designed/decided, it is nice to know that these abstractions can be built, and the ultimate goal (of being able to quickly, reliably, program low-power environmental sensors) can be achieved.

Note: The voltage regulator should be one that has an ENABLE line. This way, the 328P can be used to enable/disable power to the uSD card.

A Radio Board

  • Listener. ATMega 328P
  • Listener Config. Address jumpers
  • Radio. An ESP8266.
  • Power. Voltage Regulator

We can go one of two ways on this board: it can have a listener and an ESP8266, or it can have just an ESP8266. The ESP8266 will drop into a reasonably low-power mode, but it is not as miserly as the 328P. As a result, we may want the intermediary, where the “wakeUp()” command will first wake the 328P, and it will then power up the ESP8266.

In terms of software design, we would implement the same API (perhaps) on the 328P as a storage device. That way, storing data and sending data look exactly the same. The “smarts” for handling retry/etc. live on the local listener. The 328P, therefore, looks like every other listener: an I2C protocol implementation, and communication with the ESP8266 can be carried out over serial. This then looks like any Arduino sketch that uses an ESP8266 board, and allows us to use a stock firmware on the radio, as opposed to writing a custom controller in Lua or Python on the 8266. (If we later want to redesign this board, we can… but, it might be easier to start this way.)

If we must, we can use an ESP-01 module (with its eight-pin header), or we can use a 12-E/F module (and surface-mount it). In other words, we could design this layer incrementally… one where we can prototype with a component we can plug in, and then evolve the design to one that we solder directly onto the board. There may be some question of using an external antenna… which would need to be inside the enclosure, but it could be done regardless.

I am worried about WiFi through the PVC in the field, but again… that’s what testing is for. This could become a LoRa radio, or a proprietary 2.4GHz link using a Nordic part. However, by using WiFi, we can have a base station that provides WiFi-to-cellular bridging as a COTS purchase. So… we’re really in a position where we want WiFi to “just work.”

Note: When we want to send data, how do we do it? Does the radio retrieve everything since the last send? Does the storage layer have a “get everything since last transmission” API call? Or does the controller have to shuffle everything from the storage to the radio, and keep track of these things? In short, how smart is the stack? It would be nice to be able to say “Radio.transmitNewData()”, and have it handle talking to the storage layer (we would hand it an I2C address at setup), get the data, and send it. This suggests the storage layer has “Storage.getNewData()”, and it knows which datapoints are considered “new,” because it maintains an internal pointer that is updated every time this is invoked. (We should also be able to “getNewDataPointer()” and similar.) Ideally, though, we can either squirt data directly at the radio from the controller, or the controller can hand off the issues of getting all of the data and sending it, so that the control code looks simple, and the complexity is implemented (once, and correctly) in the interface to all Radio and Storage layers.
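The “getNewData()” idea boils down to an append-only log plus a high-water-mark pointer. A small sketch of that bookkeeping (Python, with illustrative names; none of this is designed yet):

```python
# Sketch of the storage layer's "new data" bookkeeping: an append-only
# log of readings, plus a pointer that advances on every transmission.
class StorageLayer:
    def __init__(self):
        self.readings = []
        self.sent_pointer = 0   # index of the first not-yet-sent reading

    def store_next(self, reading):
        self.readings.append(reading)

    def get_new_data(self):
        new = self.readings[self.sent_pointer:]
        self.sent_pointer = len(self.readings)  # mark everything as sent
        return new

storage = StorageLayer()
for reading in (18.5, 18.7, 19.0):
    storage.store_next(reading)
print(storage.get_new_data())  # → [18.5, 18.7, 19.0]
print(storage.get_new_data())  # → [] (nothing new since last send)
```

A real implementation would persist the pointer (so a reset does not re-send a season of data), but the shape of the API is the same.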

Other Boards

Using the above board as examples, we also imagine boards that might have flash or FRAM for storage, boards where we connect I2C-based sensors, a power distribution board (where we have a voltage regulator and battery connections), and… so far, that might be it.


For this project, the wafer approach lets students cut their teeth on circuit design one board at a time. Instead of trying to design a single board with everything on it, we design multiple boards that each do just one thing. This gives students the opportunity to design or revise wafers in (conceptual) isolation, as well as providing an ongoing source of projects. If we decide to scale down to 1.5” or 1” PVC, then we have a whole redesign process on our hands… but we can do it piecewise.

The stackable/wafer approach is also nice from a programming perspective; any board that has a local listener will be written to receive I2C messages, and do things in response. This makes the state machines for each wafer simpler to write: sleep, wake, process message, go back to sleep.

The individual wafers have no intelligence in the context of the overall sensor, which means we can develop and test each wafer’s API, have confidence in that wafer, and then integrate it into the sandwich stack. “Unit testing” of individual wafers is a huge win for this kind of application.

The controller is also interacting over a common protocol, and we can create small OO wrappers, so that we have a common API across all objects. Every board, for example, should have a “goToSleep()” method. We then develop the driver code so that where there are differences, we simply don’t care. That is, we call “Clock.goToSleep()” to put the DS3231 to sleep, even if that is actually a different set of I2C commands than if we put a storage layer to sleep; in either case, we invoke the “goToSleep()” method.
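That “common method, device-specific traffic underneath” idea can be sketched as a thin class hierarchy. (Python for brevity; the real drivers would be Arduino C++. The command bytes below are placeholders, though 0x68 is the DS3231’s usual I2C address.)

```python
# Sketch: every driver exposes go_to_sleep(), but each sends its own
# device-specific I2C traffic. Command bytes are placeholders.
class Layer:
    def __init__(self, bus, address):
        self.bus, self.address = bus, address

    def go_to_sleep(self):
        raise NotImplementedError

class Clock(Layer):
    def go_to_sleep(self):
        # a DS3231-specific register write would go here
        self.bus.append((self.address, 0x0E))

class Storage(Layer):
    def go_to_sleep(self):
        # a storage-listener-specific command would go here
        self.bus.append((self.address, 0x03))

bus = []  # stand-in for the I2C bus: records (address, command) writes
for layer in (Clock(bus, 0x68), Storage(bus, 0x20)):
    layer.go_to_sleep()   # same call, different traffic

print(bus)  # → [(104, 14), (32, 3)]
```

The controller's loop only ever sees `go_to_sleep()`; the per-device differences live in the drivers.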

It is also designed for the enclosure. This is a problem with many COTS boards: they are designed as hobbyist boards that are breadboardable, but they are not designed for low power/extended battery usage, and they are not designed for a particular enclosure system. Here, we settle on PVC, and design against the constraints of a cylindrical enclosure.

We can reuse code from similar projects (Rocketscream’s low-power library, for example, and the OpenLog firmware may serve as a starting point for our storage layer), and we stay in the Arduino ecosystem with the lowest-power chip in that ecosystem.

It would be nice to be able to buy Adafruit Feathers, design against the Feather “standard,” and be done. However, Feathers are not designed for low power usage. So, we could pick the standard, but we would be in a position of designing all new boards.

That said… Feathers are open source. Therefore, we could take (say) the 328P board, rip off anything we don’t want, and then use it.

But, we would need to 3D print a harness that held the Feathers at an angle to fit in the tube, or run them along the central axis… and then we would have limited-or-no stackability. This eliminates the benefits of the Feather as a formfactor.

Qwiic (from Sparkfun) has no way to control power on each board. There’s no form-factor standard. There’s no clear way to mount them in a given enclosure. If we use PVC, we will end up with a mess of boards wired to each other, which feels… messy. It lacks support for SPI, which may (for some sensors) be critical to our applications. And there is no provision for GPIO signaling if we absolutely need it.

Grove has similar drawbacks to Qwiic, but lacks the commitment to a single protocol.


We have to design everything.

We have to develop the API, and write all of the code.

We aren’t starting with a CircuitPython-compatible CPU. It would be really nice to design with the SAMD21 or SAMD51, but… with limited experience, and unknown library support, this could lead us into a space where we’re doing more embedded software engineering with students new to embedded design than we like. We can always transition over time; because we are designing against an I2C API, we can (for example) replace the controller board with a CircuitPython-ready CPU, and program it in Python, and still have the same abstractions.

Rebuilding the TVM (2019-04-23)
Debugging I2C (2019-03-09)
Motivation for a New Embedded Formfactor (2018-06-05)
Thoughts on Sensor Design (2018-05-02)

Reflections on Day One


If you’re curious what a (long) blog post from Prof. Jadud looks like, take a look at the DCS website. You can catch some of his reflections on the first day.

A New Term


For reasons that are becoming clear, we refer to this as the winter term at Bates…

Lynx Rufus provides a landing page for:

  • Courses I teach at Bates,
  • Projects that students and I collaborate on, and
  • Research students engage in that I support and mentor.

And, who knows what else. This term, I’ll be asking students in DCS 102 to blog; I might do that here, I might do it on the DCS homepage. (Probably the latter.)

Reflections on Day One (2018-01-09)
A New Term (2018-01-08)