I've seen an issue on a 14-node, 250 kbit/s CAN bus (with a mixture of 11-bit and 29-bit nodes) where CAN frames are often corrupted by incorrect bus activity. The screenshot explains it more clearly.
This generally happens around 10-15 seconds after the system is powered up, and the disturbance can last anywhere from about 100 bit times to 268 bit times; the screenshot shows an event lasting 205 bit times. The longer disturbances can cause some nodes to enter the "bus off" state.
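For scale, here is a quick sketch converting those bit times into real time. It assumes the bus really runs at its nominal 250 kbit/s (4 µs per bit); the function name is just for illustration.

```python
BIT_RATE = 250_000  # bit/s, the nominal bus speed stated above (assumed accurate)


def bit_times_to_us(bit_times: int, bit_rate: int = BIT_RATE) -> float:
    """Convert a duration given in bit times to microseconds."""
    return bit_times * 1_000_000 / bit_rate


# The shortest, the captured, and the longest disturbance mentioned above.
for n in (100, 205, 268):
    print(f"{n} bit times = {bit_times_to_us(n):.0f} us")
```

So the disturbances last somewhere between roughly 0.4 ms and 1.1 ms.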
I think what is happening is that a good CAN frame gets as far as its data field when some other node begins to drive dominant bits. This may go undetected initially, as the transmitter's data may contain a number of 0s. At some point, either the frame's transmitter detects a dominant bit while it is trying to send a recessive one, or other nodes fail to see an expected stuff bit (or both), and an error frame is signalled (the section with the largest amplitude). The bus is then held in a dominant state, presumably by just one node, but this node eventually releases the bus and allows it to operate normally again.
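To illustrate the two detection paths I have in mind, here is a minimal sketch. The bit patterns are made up for illustration and are not taken from the capture; the bus is modelled as a wired-AND (0 = dominant, 1 = recessive), and all function names are hypothetical.

```python
def wired_and(a, b):
    """Model the bus level: a dominant bit (0) from any node wins."""
    return [x & y for x, y in zip(a, b)]


def first_bit_error(sent, bus):
    """Index where the transmitter sends recessive (1) but reads dominant (0)."""
    for index, (s, b) in enumerate(zip(sent, bus)):
        if s == 1 and b == 0:
            return index
    return None


def first_stuff_error(bus):
    """Index of the bit that completes a run of six equal bits, or None."""
    previous, run_length = None, 0
    for index, bit in enumerate(bus):
        if bit == previous:
            run_length += 1
        else:
            previous, run_length = bit, 1
        if run_length == 6:
            return index
    return None


# Made-up, correctly stuffed data bits from the legitimate transmitter.
transmitted = [0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1]
# Hypothetical rogue node: recessive at first, then dominant from bit 4 onwards.
interferer = [1] * 4 + [0] * 12

bus = wired_and(transmitted, interferer)
print("transmitter bit error at bit", first_bit_error(transmitted, bus))
print("receiver stuff error at bit", first_stuff_error(bus))
```

With this particular pattern the interference goes unnoticed for two bits because the transmitter was sending 0s anyway, and the transmitter sees its bit error before the receivers see the stuff error. Different data would change which comes first.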
Initially, I thought it might be a node that is unsynchronised with the rest of the bus and starts a CAN frame when it is not supposed to. However, it seemingly makes no attempt to put a legitimate frame out, even assuming it was running at a very low baud rate, and that would not explain why the number of dominant bits varies.
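As a rough sanity check on the low-baud-rate idea: bit stuffing limits a legitimate frame to runs of at most five equal bits, so a slow node could only hold the bus dominant for five of its own bit times. The sketch below assumes, pessimistically, that the entire disturbance is one unbroken dominant run (in reality it also includes the error frame), and the function name is hypothetical.

```python
NOMINAL_BIT_RATE = 250_000  # bit/s, nominal speed of the rest of the bus
MAX_LEGAL_RUN = 5           # bit stuffing allows at most five equal bits in a row


def implied_bit_rate(run_in_nominal_bit_times: int) -> float:
    """Highest bit rate at which a stuffed frame could contain a dominant run
    this long, measured in bit times of the nominal 250 kbit/s bus."""
    return NOMINAL_BIT_RATE * MAX_LEGAL_RUN / run_in_nominal_bit_times


for n in (100, 205, 268):
    print(f"{n} bit times -> at most {implied_bit_rate(n):.0f} bit/s")
```

Under that assumption the offending node would need to be running at somewhere between about 4.7 kbit/s and 12.5 kbit/s, and at a different rate for each event, which is part of why this explanation does not convince me.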
Has anyone experienced this kind of error before, or can anyone offer possible solutions?
I have not seen any node that was missing before the error and present afterwards, which would have suggested a culprit. I've started to take nodes off the bus one by one to see if the problem goes away, but any other suggestions would be welcome.
Thanks in advance for any help.