Ignoring pending interrupts in the Linux kernel
Recently I was writing a Linux kernel driver for some hardware I made that interacts with a serial bus. The hardware consists of a transmitter and a receiver. The bus is half-duplex and shares the same physical line for sending and receiving data. For simplicity there is no hardware way to unhook the receiver from the bus. Because of this, anything my transmitter sends on the bus will be simultaneously received by the receiver. I don't want my own outgoing data to end up in the software receive pipeline. Hence the function for sending data on the bus must somehow ignore everything that is received while the transmission is ongoing.
The code for sending data looks roughly like this:
    static int foo_send(struct device *dev, const char *buf, size_t count)
    {
        struct foo_drvdata *ddata = dev_get_drvdata(dev);

        disable_irq(ddata->recv_irq);

        (code to feed data to the transmitter and wait until the data is sent...)

        enable_irq(ddata->recv_irq);

        return 0;
    }
Here recv_irq is a hardware interrupt line that is triggered by my receiver when it has received some data from the bus. It is set up to call a handler function (interrupt service routine) that fetches the received data from the receiver and puts it into the receive FIFO. It's hooked to a GPIO line with some code in the driver's initialization function that looks roughly like this:
    static int foo_probe(struct platform_device *pdev)
    {
        struct gpio_desc *desc = ...

        ddata->recv_irq = gpiod_to_irq(desc);
        devm_request_any_context_irq(dev, ddata->recv_irq, foo_recv_isr,
                                     IRQF_TRIGGER_FALLING, "foo-recv", data);

        (a bunch of other initialization code here...)

        return 0;
    }
This is simplified quite a bit to hide unimportant details. For example, the send function itself is split up into parts called from various interrupts to avoid hogging the CPU while the transmission is happening and so on.
The gist of it however is that I'm trying to ignore the received data by disabling the receive interrupt while the transmission is happening. I know that during that time any data I receive will just be my own transmission reflected back to me. So I just want to ignore all interrupts from the receiver during that time. As soon as the transmission is completed I need to start processing the interrupts again so that I can capture any reply.
For the sake of completeness, I was developing this on the i.MX35 platform and using kernel 4.14.78, but the code should be reasonably platform independent.
Of course, things are not this simple and this doesn't work. When sending data, I would receive part of my own data back through the interrupt handler despite disabling the interrupt. After a transmit, I would receive zero, one or two receiver interrupts immediately after calling enable_irq(). The number of interrupts depended on the amount of data I sent.
I knew this had something to do with interrupts being held in a pending state while they are disabled and then being acted upon once they are enabled again. The fact that I received up to two interrupts made me suspect that interrupts were being held in a pending state at two different layers. Figuring out how this all works in the kernel and how to actually ignore interrupts on purpose took quite some time.
I found some related questions on Stack Overflow, but they weren't particularly helpful. One comment suggested that this is an unreasonable thing to do (it's not) and the other suggested recompiling the kernel without CONFIG_HARDIRQS_SW_RESEND. That would affect things outside of my driver, so even if it worked, it wasn't something I wanted to do. I also found a paragraph in the GPIO Driver Interface documentation that talks about a fringe use case of enabling and disabling interrupts in the CEC driver that sounded like what I wanted to do, even though I'm not using the same GPIO lines for input and output. This too turned out to be a dead end, since I did not want to modify the i.MX35 GPIO driver.
The first clue I got was this comment above the irq_disable() declaration in kernel/irq/chip.c:
     * If the chip does not implement the irq_disable callback, we
     * use a lazy disable approach. That means we mark the interrupt
     * disabled, but leave the hardware unmasked. That's an
     * optimization because we avoid the hardware access for the
     * common case where no interrupt happens after we marked it
     * disabled. If an interrupt happens, then the interrupt flow
     * handler masks the line at the hardware level and marks it
     * pending.
     *
     * If the interrupt chip does not implement the irq_disable callback,
     * a driver can disable the lazy approach for a particular irq line by
     * calling 'irq_set_status_flags(irq, IRQ_DISABLE_UNLAZY)'. This can
     * be used for devices which cannot disable the interrupt at the
     * device level under certain circumstances and have to use
     * disable_irq[_nosync] instead.
The imx35-gpio driver (gpio-mxc.c) that implements the native GPIO functions on i.MX35 indeed does not implement the irq_disable() callback in struct irq_chip. As the comment explains, in this case the kernel only marks the interrupt as disabled in its own data structures when the disable_irq() function is called. It does not touch the hardware at all, since the expectation is that the interrupt will only be disabled for a short amount of time and in most cases the interrupt will not happen at all. Apparently hardware register access is slow and developers wanted to avoid it if possible.
If the interrupt does happen, the kernel does not run the disabled handler, but flags the interrupt as pending in the internal kernel data structures. Only then does the kernel actually mask the interrupt in hardware. The pending handler will later be called by the kernel from enable_irq(). This was one way I was getting a delayed call to my receiver interrupt handler!
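To make the lazy disable behaviour more concrete, here is a small user-space model of it in plain C. It is only a sketch of the mechanism described in the kernel comment, not the kernel's actual code: struct toy_irq and all the toy_* functions are names invented for this illustration, and the real implementation keeps far more state than three flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of one interrupt line as the generic IRQ layer sees it.
 * Every name here is hypothetical and exists only for illustration. */
struct toy_irq {
    bool disabled;      /* set by toy_disable_irq(), software state only */
    bool masked;        /* line masked in the (modelled) hardware */
    bool pending;       /* an event was seen while disabled */
    int handler_calls;  /* how many times the driver's handler ran */
};

/* Lazy disable: only mark the line as disabled, leave the hardware alone. */
static void toy_disable_irq(struct toy_irq *irq)
{
    irq->disabled = true;
}

/* The hardware signals an event on the line. */
static void toy_hw_event(struct toy_irq *irq)
{
    if (irq->masked)
        return;               /* masked: this software layer never sees it */
    if (irq->disabled) {
        irq->masked = true;   /* only now is the hardware touched */
        irq->pending = true;  /* remember the event for later */
        return;
    }
    irq->handler_calls++;     /* normal case: run the handler right away */
}

/* Enable: unmask the line and replay the remembered event, if any. */
static void toy_enable_irq(struct toy_irq *irq)
{
    assert(irq->disabled);    /* enabling a line that is not disabled is a bug */
    irq->disabled = false;
    irq->masked = false;
    if (irq->pending) {
        irq->pending = false;
        irq->handler_calls++; /* the delayed handler call */
    }
}
```

In this model, any number of events arriving between toy_disable_irq() and toy_enable_irq() result in exactly one delayed handler call when the line is enabled again. Events that arrive after the line has been masked are not visible at this layer at all.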
Helpfully, the comment also hints that this delayed mechanism of touching the actual hardware can be disabled by setting a flag after we request the interrupt line:
irq_set_status_flags(ddata->recv_irq, IRQ_DISABLE_UNLAZY);
Doesn't the double negation in "disable unlazy" resolve to "enable lazy"? That seems counterintuitive, since setting the flag disables the lazy behavior. But I digress. Setting IRQ_DISABLE_UNLAZY in my code after requesting the interrupt line fixed one spurious call of the interrupt handler. However, I was still getting one call even after this fix.
The answer to that remaining call was hidden in a careful reading of that comment above irq_disable(). The hardware interrupt is masked by the kernel, not disabled.
On i.MX35 the interrupt hardware has an interrupt mask register (IMR) and an interrupt status register (ISR). When an interrupt source is triggered, the corresponding bit in the ISR is set. Setting the ISR bit interrupts the processor and runs the interrupt handler, unless the interrupt is masked through the corresponding bit in the IMR. As long as the interrupt is masked, the processor will keep running and the interrupt will be deferred. The moment the interrupt is unmasked in the IMR, the processor will run the pending interrupt. The ISR bit is later cleared by the kernel as part of the interrupt handler mechanism. The kernel knows about all this through the struct irq_chip set up by gpio-mxc.c.
This is the second level, this time in hardware, at which pending interrupts were held, and it was responsible for the remaining call to my receive handler. On the hardware level the interrupt was still happening, but it was not acted upon because disable_irq() masked it in the IMR. As soon as the interrupt was unmasked in enable_irq(), it triggered and my handler was called.
The best way to fix this would be to prevent the interrupt from happening in the first place: to actually disable the interrupt on the hardware level in the GPIO subsystem, not just mask it with disable_irq(). Unfortunately, I could not find a clean way to do this through the GPIO driver layer, and poking the i.MX35 GPIO registers directly from my otherwise platform-independent driver seemed like an obviously bad idea. I'm guessing the best way would be for gpio-mxc.c to actually implement that irq_disable() callback mentioned in kernel/irq/chip.c.
In the end I settled on clearing the ISR before calling enable_irq(). A way to do that in what I believe is a reasonably cross-platform way is to call the interrupt acknowledgment callback like this:
    struct irq_desc *desc = irq_to_desc(ddata->recv_irq);

    if (desc->irq_data.chip->irq_ack) {
        desc->irq_data.chip->irq_ack(&desc->irq_data);
    }
Adding this to the send function finally removed the last spurious receiver interrupt handler call in my driver.
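The hardware side of the story can be modelled in the same spirit. The following user-space sketch assumes a generic interrupt block with one latched status bit and one enable bit per line. It is not the i.MX35 register layout, and struct toy_gpio and the toy_* functions are hypothetical names used only here. It shows why an event latched while the line is masked fires on unmask, and why acknowledging it first makes it disappear.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of a GPIO interrupt block. All names are hypothetical. */
struct toy_gpio {
    uint32_t status;    /* latched events, one bit per line (the "ISR") */
    uint32_t enabled;   /* lines currently allowed to interrupt the CPU */
    int handler_calls;  /* how many times the handler ran */
};

static uint32_t toy_bit(unsigned int line)
{
    assert(line < 32);
    return UINT32_C(1) << line;
}

/* Run the handler if the line has a latched event and is not masked. */
static void toy_deliver(struct toy_gpio *g, unsigned int line)
{
    uint32_t bit = toy_bit(line);

    if ((g->status & bit) && (g->enabled & bit)) {
        g->handler_calls++;
        g->status &= ~bit;  /* status is cleared as part of handling */
    }
}

/* An edge on the line always latches the status bit, masked or not. */
static void toy_edge(struct toy_gpio *g, unsigned int line)
{
    g->status |= toy_bit(line);
    toy_deliver(g, line);
}

static void toy_mask(struct toy_gpio *g, unsigned int line)
{
    g->enabled &= ~toy_bit(line);
}

/* Unmasking delivers whatever was latched in the meantime. */
static void toy_unmask(struct toy_gpio *g, unsigned int line)
{
    g->enabled |= toy_bit(line);
    toy_deliver(g, line);
}

/* Acknowledge: drop the latched event without running the handler. */
static void toy_ack(struct toy_gpio *g, unsigned int line)
{
    g->status &= ~toy_bit(line);
}
```

With this model, the sequence toy_mask(), toy_edge(), toy_unmask() ends with one handler call, while toy_mask(), toy_edge(), toy_ack(), toy_unmask() ends with none. The second sequence corresponds to calling the irq_ack callback before enable_irq() in the driver.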
What is the takeaway from all this? When you disable an interrupt in the Linux kernel, the interrupt typically isn't disabled on the hardware level. What disable_irq() actually does is disable the call to the interrupt handler. From the perspective of kernel developers it is primarily a software synchronization mechanism. For example, the disable_irq() / enable_irq() pair is used to protect critical parts of the code to prevent race conditions when the main kernel thread and the interrupt handler are accessing the same data structure. The kernel tries hard to serve all interrupts, even those that happen while the handler cannot run immediately.
Actively trying to throw away interrupts as they happen is discouraged and seen as an unusual use of the interrupt system. The kernel requires you to jump through some hoops to work around all the layers where pending interrupts are held. The preferred way is to not make the interrupts happen in the first place. However, this may not be possible: the hardware may not support disabling the source of the interrupts, and some GPIO drivers lack the hooks that could be used to disable and enable GPIO interrupt lines on the fly.