Network-Engineering | < 1 min read

Junos Nonstop Active Routing and rpd failure

ngworx Team
September 2019
written by Remi Locherer
Senior Network & Security Engineer

How to prevent downtime in case of rpd failure on redundant Junos devices. 

We recently had a support case where the routing protocol process (rpd) on a Juniper MX router crashed. This is bad and should not happen, but since every piece of software has bugs, it is not a big surprise that rpd can crash too. On Junos OS, rpd is the process that implements routing protocols such as BGP and OSPF. It maintains adjacencies and sessions with neighbouring routers.

The affected MX router was equipped with two routing engines (REs), and Nonstop Active Routing (NSR) was active. It was therefore a surprise that this rpd crash caused downtime! The expectation was that the backup RE would take over without the neighbouring routers noticing.

It turned out that this router was missing the configuration statement “set system switchover-on-routing-crash”.

A failover to the backup RE happens on an rpd crash only when this setting is present.

Since then, we have seen more NSR configurations that lack this important setting. We recommend that users of NSR always activate switchover-on-routing-crash.
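
As a rough sketch, an NSR setup on a router with two REs that includes the switchover statement could look like the set commands below. The first three statements are the usual GRES/NSR prerequisites and are an assumption about what a typical configuration already contains; they are not taken from the affected router. Only the last statement is the one discussed here. The lines starting with # are annotations for the reader. Check the Junos documentation for your platform and release before applying any of this.

```
# Assumed NSR baseline (typical prerequisites, not from the affected router)
set chassis redundancy graceful-switchover
set routing-options nonstop-routing
set system commit synchronize

# The statement that was missing: switch to the backup RE when rpd crashes
set system switchover-on-routing-crash
```

After committing, the operational command “show task replication” can help confirm that replication is enabled and that the routing protocols are synchronised to the backup RE.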
