\$\begingroup\$

I am building an ESP32 WROVER-based custom PCB which includes CANbus 2B communications. This PCB is a copy of my breadboard project.

On my breadboard version the CANbus works like a charm, but on my custom PCB it doesn't: I get these errors (extracted from my logs):

CANbus alerts #1: 0x0980 occurred (00000000 00000000 00001001 10000000)
CAN: ALERT_ABOVE_ERR_WARN
CAN: ALERT_BUS_ERROR
CAN: ALERT_ERR_PASS

The meanings of those errors are as follows:

  • CAN_ALERT_ABOVE_ERR_WARN: One of the error counters has exceeded the error warning limit
  • CAN_ALERT_ERR_PASS: CAN controller has become error passive
  • CAN_ALERT_BUS_ERROR: A (Bit, Stuff, CRC, Form, ACK) error has occurred on the bus
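
As a quick cross-check of the log line, here is a minimal Python sketch that splits the 0x0980 mask into individual alert bits. The bit values in the table are an assumption inferred from the logged value and the three alert names printed with it; they were not copied from the driver header, so verify them against the driver version in use.

```python
# ASSUMPTION: bit values inferred from the logged mask 0x0980 and the three
# alert names printed with it; check them against your driver's header.
ASSUMED_ALERT_BITS = {
    0x0080: "ALERT_ABOVE_ERR_WARN",
    0x0100: "ALERT_BUS_ERROR",
    0x0800: "ALERT_ERR_PASS",
}


def decode_alerts(mask):
    """Return the names of the known alert bits set in mask, lowest bit first."""
    return [name for bit, name in sorted(ASSUMED_ALERT_BITS.items()) if mask & bit]


def unknown_bits(mask):
    """Return the part of mask that is not covered by the assumed table."""
    known = 0
    for bit in ASSUMED_ALERT_BITS:
        known |= bit
    return mask & ~known


if __name__ == "__main__":
    mask = 0x0980
    print(f"{mask:#06x} = {mask:016b}")
    print(decode_alerts(mask))
    print(f"unrecognised bits: {unknown_bits(mask):#06x}")
```

With the assumed table, all three set bits of 0x0980 are accounted for, which matches the three alert lines in the log.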

On my breadboard version, I use the Waveshare SN65HVD230 CAN transceiver board, which is documented here: https://www.waveshare.com/wiki/SN65HVD230_CAN_Board.

On the PCB I use the MCP2551, whose datasheet is here: http://ww1.microchip.com/downloads/en/devicedoc/21667d.pdf

On the PCB, I tied Rs to ground through a 10 kΩ resistor.

The H/L wires are quite short (1 m), and adding or removing a 120 Ω resistor does not change anything.

I would appreciate any help on the matter, any clue or ideas for further investigations.

The Waveshare transceiver schematics

enter image description here

[UPDATE] Our PCB schematics sporting the MCP2551

enter image description here

[UPDATE] Adding information that Pynomial talked about in his answer.

  • We run the bus at 250 kbaud;
  • The BMS (battery management system) devices attached to the CANbus have a 120 Ω resistor between H and L => this is why we did not put an extra one on our PCB.

Working on the PCB exclusively from now on => it works

After hours of investigations, I did 2 things:

  • I no longer work on the breadboard but exclusively on our custom PCB => I guess the differences between the two were wasting my time, so I prefer to look into that later on;
  • I dropped the Arduino lib for the CANbus in favour of the Espressif CANbus driver.

By doing those two things, the CANbus now works.

\$\endgroup\$

1 Answer

\$\begingroup\$

Could be a bunch of reasons.

I think the most likely is logic levels. You had a working circuit with the SN65HVD230, which is a 3.3V part. The MCP2551 is a 5V part. The TXD pin is a TTL-compatible input, so if you're trying to drive it with 3.3V logic it won't work. The nominal threshold voltage for that pin is 0.75*VDD, i.e. 3.75V for a 5V supply. You'll need logic level conversion if you're driving from 3.3V.

Something else to be aware of is that you can't swap CANH and CANL. If you accidentally swapped CANH/CANL, the bus won't work reliably.

You picked slope-control mode by tying the Rs pin to ground with a resistor. The 10k resistor you used applies the maximum slew rate of around 23.5V/µs (see Figure 1-1 in the datasheet). This isn't necessarily a problem, but it depends on the bus rate you're trying to achieve. The datasheet states that for high-speed you'll want to put the MCP2551 in high-speed mode. This language isn't arbitrary: low-speed and high-speed have specific meanings in CAN standards. ISO 11898-3 is "low-speed" CAN, which operates at up to 125Kbps. ISO 11898-2 is "high-speed" CAN, which operates at up to 1Mbps.

If you take a look at the AC characteristics section, there's a specification in there for slew rates. They note that a 47kΩ resistor results in a slew rate between 5.5V/µs and 8.5V/µs. Compare this to the graph for Rs resistor values:

Graph of MCP2551 slew rate vs. Rs resistance

This implies that the resistor is setting the maximum slew rate, and the actual slew rate may be as little as 65% of the maximum (5.5 / 8.5 ≈ 0.65).

In the case of a low speed bus, you'll want to try to pick the right slew rate to match the data rate. The slew rate is the \$\frac {dV} {dt}\$, and high \$\frac {dV} {dt}\$ can lead to more ringing, EMI, and cross-talk. You generally want to pick the slowest slew rate that is reliable for your data rate. A good rule of thumb is that your slew rate should allow the bus to go from its minimum voltage to its maximum, and vice versa, in 5% of the bit period. You can calculate this as follows:

$$SR = \frac {r (V_{max} - V_{min})} {p \times 10^{6}}$$

Where \$r\$ is your bus rate in bits per second, \$p\$ is your target rise/fall percentage as a decimal (e.g. 0.05 for 5%), and \$V_{max}\$ and \$V_{min}\$ are the voltage extremes for the output lines. The result is the minimum slew rate in V/µs.

The minimum and maximum voltages should be based on the worst case for an individual line. In this case the CANH line can go from 2.0V (recessive) to 4.5V (dominant). If you're running at 100Kbps, the calculation would look something like this:

$$SR = \frac {100000 (4.5 - 2)} {0.05 \times 10^{6}} = 5V/\mu s$$

Remembering that Rs sets the maximum slew rate, and we need to ensure a minimum slew rate to meet our requirements, we can divide 5V/µs by 0.65 (i.e. 65%) to find the value that ensures the required minimum. In this case it's 7.7V/µs, or a resistance of around 51kΩ.

In the case of a 10kΩ resistor, the maximum slew rate is around 23.5V/µs, but the minimum ends up being 65% of that, i.e. 15V/µs. We can extrapolate from there to find the maximum bus rate that slew control could feasibly support while keeping rise/fall times within 5% of the bit period: 300Kbps.
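
The arithmetic above can be reproduced with a short Python sketch. The function names and the `GUARANTEED_FRACTION` constant are made up for illustration, and the 65% figure is the ratio derived above from the 47kΩ specification, assumed here to hold across the whole Rs range.

```python
def min_slew_rate(bit_rate_bps, v_max, v_min, edge_fraction=0.05):
    """Minimum slew rate in V/us for a full swing within edge_fraction of a bit."""
    return bit_rate_bps * (v_max - v_min) / (edge_fraction * 1e6)


def max_bit_rate(slew_v_per_us, v_max, v_min, edge_fraction=0.05):
    """Highest bit rate (bit/s) whose edges still fit in edge_fraction of a bit."""
    return slew_v_per_us * edge_fraction * 1e6 / (v_max - v_min)


# ASSUMPTION: the guaranteed slew rate is 65% of the Rs-set maximum (5.5 / 8.5),
# applied to every Rs value, not only to 47 kOhm.
GUARANTEED_FRACTION = 0.65

if __name__ == "__main__":
    # 100 kbit/s example, CANH swinging between 2.0 V and 4.5 V.
    required = min_slew_rate(100_000, v_max=4.5, v_min=2.0)
    rs_target = required / GUARANTEED_FRACTION
    print(f"required minimum slew rate: {required:.1f} V/us")
    print(f"pick Rs for a maximum of:   {rs_target:.1f} V/us")

    # 10 kOhm case, guaranteed slew rate rounded to 15 V/us as in the text.
    print(f"max bit rate at 15 V/us:    {max_bit_rate(15.0, 4.5, 2.0):.0f} bit/s")
```

Changing `edge_fraction` shows how sensitive the result is to the 5% rule of thumb: allowing 10% of the bit period for each edge halves the required slew rate.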

The datasheet makes it clear that slew rate control should only be used for low speed, so if you're going above 125Kbps I would connect Rs directly to ground for high-speed mode.

Another issue is the fact that you skipped the 120Ω resistor across CANH/CANL. CAN bus has an open drain output, so if you fail to put the 120Ω termination resistor across the bus it may continue to float high after a logic high is asserted on the bus. It's not just there for impedance matching - it acts a bit like a pulldown to bring CANH and CANL to the same potential when the bus isn't being actively driven. The MCP2551 is explicit about requiring a minimum bus load of 45Ω. You need to terminate the bus on both sides.

Common-mode noise could also be a problem. You can compensate for this, to an extent, using a split termination design. Two 60Ω resistors and a capacitor to ground in the middle. This approach forms a low pass filter for common-mode noise, while leaving the differential-mode signal intact. You can also turn this into a biased split termination, which is what the VREF pin on the IC is for.

schematic


(Note: 4.7nF was chosen based on the reference material I could find; typical values are apparently anywhere between 1nF and 100nF)

Make sure you use precision (1%) resistors, since the 60Ω needs to be carefully matched on each side to avoid asymmetric biasing.
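
To get a feel for where the split termination starts filtering, here is a rough Python estimate of the common-mode corner frequency. It rests on assumptions that are not stated above: a simple first-order RC model, ideal components, and the two 60Ω resistors acting in parallel (30Ω) for common-mode signals. Treat the numbers as ballpark figures only.

```python
import math


def common_mode_corner_hz(r_each_ohm, c_farad):
    """First-order corner frequency of a split termination for common-mode noise.

    ASSUMPTION: ideal RC low-pass; for common-mode signals both split
    resistors feed the capacitor in parallel, so R = r_each_ohm / 2.
    """
    r_parallel = r_each_ohm / 2.0
    return 1.0 / (2.0 * math.pi * r_parallel * c_farad)


if __name__ == "__main__":
    # Capacitor values spanning the 1 nF to 100 nF range mentioned above.
    for c in (1e-9, 4.7e-9, 100e-9):
        f = common_mode_corner_hz(60.0, c)
        print(f"C = {c * 1e9:5.1f} nF -> corner ~ {f / 1e3:8.1f} kHz")
```

Under this model, a larger capacitor moves the corner lower and filters more of the common-mode noise, which is one way to read the wide range of typical values.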

You need both circuits to have a roughly equivalent ground potential. It doesn't need to be perfect, but DC offset rejection isn't infinite in these transceiver ICs. If you're running both circuits on separate supplies (particularly battery) then bridging the ground planes can be helpful.

The MCP2551 has a permanent dominant detection for extended low states on the TXD input, which means that your maximum bit time must be 62.5µs. This implies that you cannot run the MCP2551 slower than 16Kbps. If you try to run it slower, the bus drivers switch off until you bring TXD high again. This will mess up data transmission. Check the AC characteristics section of the datasheet for min/max timings.
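
The conversion from the 62.5µs limit to a minimum bus rate is just a reciprocal; this small Python sketch also checks a candidate bit rate against it. The helper names are hypothetical, and the only datasheet-derived number used is the 62.5µs maximum bit time quoted above.

```python
# Maximum bit time quoted above for the permanent dominant detection, in seconds.
MAX_BIT_TIME_S = 62.5e-6


def min_bit_rate(max_bit_time_s=MAX_BIT_TIME_S):
    """Lowest usable bit rate (bit/s) for a given maximum bit time."""
    return 1.0 / max_bit_time_s


def bit_time_ok(bit_rate_bps, max_bit_time_s=MAX_BIT_TIME_S):
    """True if one bit at bit_rate_bps is no longer than the maximum bit time."""
    return 1.0 / bit_rate_bps <= max_bit_time_s


if __name__ == "__main__":
    print(f"minimum bit rate: {min_bit_rate() / 1e3:.0f} kbit/s")
    for rate in (10_000, 125_000, 250_000):
        print(f"{rate:>7} bit/s ok: {bit_time_ok(rate)}")
```

At 250 kbaud the bit time is 4µs, far below the limit, so this particular constraint is unlikely to be the cause at that rate.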

If the device talking to the MCP2551 is trying to send data on startup, you might need to include a short hold-off time on startup while the MCP2551 comes out of power-on reset state. VDD must have reached at least 4.3V before trying to assert TXD. Since your ESP32 is running off a separate 3.3V rail, it could conceivably be coming up faster than the 5V, meaning you're sending data before the transceiver is properly powered.

Any of these things could be the issue, but I've tried to put the most likely ones first. That said, you should definitely include the 120Ω (or split termination) resistor regardless of whether or not it appears to make a difference. Not doing so is a recipe for random failures that lead to debugging nightmares down the line.

\$\endgroup\$
  • \$\begingroup\$ Very impressive work! It is going to help with our design. We are looking thoroughly at all the points you mentioned. I am going to add some of the information that the hypotheses in your answer raised. \$\endgroup\$ Commented Mar 7, 2022 at 10:18
