Around five years ago I performed some measurements of interrupt response times in a Raspberry Pi Zero and an Arduino. My goal was to get some rough estimates of what kind of real-time performance you can expect from these systems. I was not interested in pushing them to their limits. I wanted to compare the most straightforward approaches - code you would find in documentation or in examples that pop up on top of web searches. This year the Raspberry Pi Pico was released and it promises to become just as popular. It brings some interesting new features that I wanted to explore, like MicroPython and the programmable I/O (PIO). I thought it would be interesting to repeat my old measurements and see how well it compares to the other two systems.
I only briefly summarize my previous results here. Read my original blog post for a longer introduction, description of the test setup and more in-depth discussion of the first batch of measurements. In the follow up post I also dug a little deeper into the reasons behind some of the more unusual results I got with Arduino and Raspberry Pi Zero.

For the purpose of this test, the interrupt response time is the time the system takes to change a state of an output GPIO pin in response to the change in an input GPIO pin. In real applications there is usually some kind of processing involved, so this value represents only the best-case scenario of how fast the software can respond to external events.
This response time was measured using a signal generator and an oscilloscope. A square wave generated by the signal generator was connected to the input pin. The two-channel oscilloscope was connected to both the input pin and the output pin. It was setup to measure the interval between the two state changes. The measurement was automated and repeated 500 times for each setup. Exact settings used are noted here.
To perform the test with the RP2040 processor on the Raspberry Pi Pico I installed a MicroPython firmware, as described in the Getting Started guide. I tried two implementations: A pure Python implementation was using the machine.Pin built-in class to configure a Python function as an interrupt handler. The PIO implementation used the rp2.asm_pio
decorator to program the PIO state machine from Python code (see Section 3.9 in the Python SDK manual). After the state machine was programmed, the input was handled purely inside the PIO and the Python interpreter was not involved. You can find exact code I used in the GitHub repo.
Here is how the new measurements with the RP2040 compare with Arduino and the Raspberry Pi Zero:

The MicroPython implementation on the RP2040 (yellow) has the average response time of around 60 μs. This is around 3.5 times faster than using a CPython implementation on the Zero (cyan) which averages at around 210 μs. It is also more consistent, with less spread between minimum and maximum response times. A surprising result at the first glance, since Zero has a much more capable CPU running at up to 1000 MHz while the ARM core in the Pico only runs at 125 MHz.
The difference is very likely due to all the Linux kernel housekeeping and context switching that happens before the interrupt is propagated from the hardware to the Python process. MicroPython, while quite complex, is still a lightweight interpreter compared to the full CPython on the Zero. This is consistent with the fact that a C implementation that runs in the kernel on the Zero (blue) is much faster than MicroPython on the RP2040.
The following figure zooms in on the left end of the histogram:

Here you can see that the PIO implementation is amazingly fast compared to all previously tested configurations. With the average response time of 0.043 μs it beats both the polling and the interrupt-driven C++ implementation on the Arduino by two orders of magnitude.
This comparison is a bit unfair though. The specialized PIO state machines on the RP2040 are indeed very fast, with only 8 ns per instruction and an instruction set that is optimized for responding to input events. However, the amount of processing you can do with them is very limited compared to all other approaches I've tested. Each PIO can only process 32 instructions. Most real-life applications beyond interfacing with a simple bus protocol will need a round-trip to MicroPython. This puts the response time back into the hundred-microsecond range.
Still, investigating PIO performance is interesting. Here is another level of zoom to show only the distribution of response times by the PIO implementation:

The response times should be in the range of 4 to 5 instruction cycles - 2 cycles for the input synchronizer (see 3.5.6.3 in the RP2040 Datasheet), between 1 or 2 cycles for WAIT and 1 cycle for SET. I did not use any clock dividers and used the default 125 MHz system clock, so each instruction takes 8 ns. This gives the range of response times between 32 to 40 ns.
I measured between 38 and 48 ns. Very likely this is a measurement error. Unfortunately my signal generator has a rise-time of around 10 ns. This means that in the nanosecond range the transition between low and high logic level is not well defined and this introduces an error into my measurement. I verified by other means that one PIO instruction indeed takes exactly 8 ns in my setup. It is also possible that I missed something and there is an additional PIO cycle (or two) needed somewhere before the response propagates to the GPIO pin.
On the oscilloscope screenshot below, the blue trace is the stimulus signal from the signal generator and the yellow trace is the response generated by the PIO on the output pin. You can see that the rise times are not insignificant compared to the measured time interval.

In the end this was an interesting exercise. I was surprised by the performance of MicroPython on the Raspberry Pi Pico and how quick the development setup is. I honestly expected Python code to run slower and I was again reminded that my intuition can be wrong sometimes. Unfortunately I didn't have time to setup the C SDK to also try out a native implementation of the same test on the RP2040. Perhaps some other day.
Programmable I/O is certainly the most interesting part of the RP2040. It took me a while to understand the unusual instruction set and how the FIFO buffers work. I like how the integration of the assembler into MicroPython makes it easily accessible for experimentation. I was impressed by the performance and quick response times. On the other hand, I was also surprised by how limited PIOs are in terms of the program size and the choice of instructions. I was expecting something similar to PRUs on the Sitara SoC. PIOs seem indeed very specialized devices for interfacing with digital buses and can't do much more in terms of algorithmic complexity.