6
\$\begingroup\$

This is a question for anyone with experience designing, or with deep knowledge of, volatile memory in an ASIC, e.g. chip designers or silicon process engineers.

We are using the ET1200 EtherCAT ASIC (datasheet) in one of our projects. First some background:

The ET1200 chip:

The ET1200 is an industrial Ethernet-based networking chip that essentially gives a host PC and a microcontroller shared memory. The PC (over Ethernet) and the microcontroller (over SPI) can both read and write the small amount of dual-ported memory inside the ET1200.

Our project:

Out of 50 PCBs assembled, about 25 showed symptoms of a strange bug. On some boards the bug was not seen at first, but then appeared more and more often until it occurred very frequently. Meanwhile, some boards never exhibited the bug.

What was the bug? Eventually, we tracked the bug down to a microcontroller reading uninitialised memory from the ET1200. I.e. after power up, and before the PC had written to any memory in the ET1200, the microcontroller would attempt to read that memory, but would obviously be reading uninitialised memory from the chip.

This firmware bug has now been fixed, but I'm still curious about the behaviour of the uninitialised memory inside the ET1200 ASIC.

Thoughts: Presumably the memory was not being zeroed after the ASIC reset. Sometimes the memory would read mostly zeros (that doesn't cause a bug) and sometimes it would read lots of random numbers. Sometimes those random numbers would survive a power cycle. I don't know exactly what kind of RAM is in this chip.

Questions:

  • What factors might affect the probability that an uninitialised bit in the ET1200's memory will read as 0 or 1?
  • Is it to be expected that some chips will typically appear to read all zeros, while others read random numbers?
  • What are some actual, physical mechanisms that could cause chips to change behaviour of uninitialised RAM over time? I.e. start out reading mostly zeros, but eventually start reading lots of non-zeros. It's almost as if the memory is 'burning-in'. (warming or cooling the boards did not seem to help).

What kind of answers am I interested in?

I would like some insight into how SRAM memory cells are constructed in a typical ASIC, and what mechanisms affect their reset value, and if there is a mechanism which affects their reset value over time. I am of course aware that the behaviour of uninitialised memory is undefined.

\$\endgroup\$
7
  • 2
    \$\begingroup\$ Figure 32 shows it does indeed have formal POR behaviour... I'm surprised that a device like this / for this target audience doesn't have well defined state for the RAM on reset! \$\endgroup\$
    – Attie
    Commented May 16 at 13:59
  • 1
    \$\begingroup\$ @Attie - Yep, it certainly does reset the core logic, but doesn't seem to initialise the memory. \$\endgroup\$ Commented May 16 at 15:42
  • 7
    \$\begingroup\$ In general, I would never expect any RAM of significant size (in an ASIC or otherwise) to have a global hardware reset. Except in very niche applications, it is always more efficient to have a controller of some sort write initial values to the memory at start-up. If the controller inside the ASIC isn't doing that, then you need to do it with your external controller. Better to never rely on the value of any RAM that you haven't previously written to, which is presumably the firmware bug that you fixed. \$\endgroup\$
    – Dave Tweed
    Commented May 16 at 19:09
  • 3
    \$\begingroup\$ @Rocketmagnet The ET1200 datasheet Table 4: ET1200 Feature Details has a RAM initialization entry with a legend of -, which means not available. I.e. the feature isn't present. Not sure if other EtherCAT ASICs have that feature. \$\endgroup\$ Commented Jun 10 at 13:36
  • 1
    \$\begingroup\$ @Rocketmagnet The answer to "SRAM isn't blank on powerup, is this normal?" might help to understand the effect of different SRAM implementations. I'm not a chip designer, so can't verify the physics in that answer. \$\endgroup\$ Commented Jun 10 at 14:04

3 Answers

2
+100
\$\begingroup\$

Based on https://www.eng.auburn.edu/~uguin/pdfs/JETTA-TRNG-2019.pdf and https://en-support.renesas.com/knowledgeBase/20840882 -

The initial values in an SRAM are in theory highly random, but in practice they are influenced by slight process variation. This leads to a pattern that can be quite stable for a specific die, provided the power sequence/ramping is reproducible and the conditions match. This pattern will change over time due to stress and aging. The paper describes how aging can be used to increase the randomness of the SRAM initial state. According to the authors, aging leads to shifts in the threshold voltages, and NMOS ages differently than PMOS. They induce aging by applying voltage (i.e. keeping a cell in one state) for a prolonged time.
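To make that concrete, here is a toy Python model. It is not taken from the paper; it is only a sketch under my own assumptions: each cell has a fixed offset standing in for process variation, every power-up adds fresh Gaussian noise, and holding a value for a long time shifts the offset away from that value. The names `make_die`, `power_up`, `age` and `fraction_differing`, and all the numeric values, are hypothetical and in arbitrary units.

```python
import random

def make_die(n_cells, mismatch_sigma=1.0, seed=0):
    """Fixed per-cell offsets standing in for process variation (assumed Gaussian)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, mismatch_sigma) for _ in range(n_cells)]

def power_up(offsets, noise_sigma=0.2, rng=None):
    """One power-up: a cell reads 1 if its offset plus fresh noise is positive."""
    rng = rng or random.Random()
    return [1 if off + rng.gauss(0.0, noise_sigma) > 0 else 0 for off in offsets]

def age(offsets, held_state, shift=0.5):
    """Toy aging (assumption): holding a value pushes the offset away from that value."""
    return [off - shift if bit else off + shift
            for off, bit in zip(offsets, held_state)]

def fraction_differing(a, b):
    """Fraction of cells whose value differs between two power-up snapshots."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

die = make_die(2000, seed=1)
rng = random.Random(2)
first = power_up(die, rng=rng)
second = power_up(die, rng=rng)        # same die, new noise
aged_die = age(die, first)             # die held in its first pattern for a long time
after_aging = power_up(aged_die, rng=rng)

print("differs between two fresh power-ups:", fraction_differing(first, second))
print("differs after aging:", fraction_differing(first, after_aging))
```

In this model two power-ups of the fresh die disagree only in the cells whose offset is small compared to the noise, while the aged die disagrees with its original pattern in many more cells. That matches the qualitative story above, but the numbers mean nothing for a real ET1200.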

\$\endgroup\$
1
  • 1
    \$\begingroup\$ Thank you. This answer is the closest so far to what I'm looking for. The paper linked is extremely interesting. \$\endgroup\$ Commented Jun 10 at 23:30
4
\$\begingroup\$

Big picture, you cannot count on any 'non-volatile' behavior from an ordinary SRAM. You must initialize it when your system starts up, or within your driver program for a peripheral. There is no such thing as a non-volatile SRAM cell.

Let's review what's in an SRAM. ASIC SRAMs are constructed from latch cells. From a circuit perspective, these can be thought of as a pair of cross-coupled buffers equipped with bidirectional bit lines that are used to read or write data from or to the cell.

SRAM cell example:

[Image: schematic of a 6T SRAM cell]

from here: https://moodle.insa-toulouse.fr/file.php/58/content/static_ram.html

This SRAM cell is a 6-transistor (6T) cell, a common type for CMOS. The 6T cell works as follows:

  • read: select word line, bit lines are high-Z. Latch state propagates onto the bit lines and is read by column sense amps.
  • write: select word line, bit lines are driven by write buffers to force the latch to the new state.
  • hold / keep: word line is off, latch self-reinforces its last state due to positive feedback.

The picture link gives more details.

One thing to notice is that the latch structure is symmetric. Consider the cross-coupled buffers by themselves. Without any external influence from the bit lines or other associated circuitry, and assuming exactly equal buffer behavior, the power-on state will be randomly 1 or 0, with equal probability.

What could make a latch come up with a less than random state? What could make the probability shift over time, or with system condition? If the buffer behaviors become not exactly equal, a slight difference creeps in and pushes the latch to one state or another, skewing the state distribution to something other than 50-50.

Let's start with the word line. If it were to pulse during power-on, the charge present on the bit lines could transfer to the latch. If those charges were at all unequal this would influence the latch state during power on.

What if the cross-coupled buffers weren't exactly symmetric, but instead had a slight difference in threshold or drive strength? This too would influence the power-on state, in a way that can shift with temperature, voltage, and power-on time (that is, age.)

Speaking of voltage, what if the power rails to each buffer weren't perfectly equal? Again, this could influence the power-on state, as it affects drive strength and threshold. Vdd / Vss IR drops are in turn influenced by other on-chip activity, so nearby logic activity could cause this influence to shift.

In any case, these process/voltage/temperature/age influences don't have to be much, they only need to be just enough to give the latch a slight nudge one way or another.
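To put a rough number on how small that nudge can be, here is a minimal Python sketch. It assumes (my simplification, not a measured property of any real cell) that the latch resolves to 1 whenever a fixed offset plus zero-mean Gaussian noise is positive. The function name `p_powers_up_as_one` and the units are hypothetical.

```python
import math

def p_powers_up_as_one(offset, noise_sigma):
    """P(latch resolves to 1) for a fixed offset competing with Gaussian noise.

    Assumed model: the cell reads 1 when offset + noise > 0, with
    noise ~ N(0, noise_sigma**2), so the result is the normal CDF
    evaluated at offset / noise_sigma.
    """
    return 0.5 * (1.0 + math.erf(offset / (noise_sigma * math.sqrt(2.0))))

# Offsets are expressed in multiples of the noise sigma.
for offset in (0.0, 0.5, 1.0, 3.0):
    print(offset, p_powers_up_as_one(offset, noise_sigma=1.0))
```

With zero offset the cell is a fair coin. Under this model, an offset of a few noise sigmas is already enough to make the cell come up the same way almost every time, which is why a small drift in that offset can visibly change the power-on pattern.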

And the post-power-on memory state may have nothing to do with physics at all. Some ASICs with large internal memories include a built-in self test (BIST) that is activated once power is stable. The BIST pass tests RAM blocks, and might even re-map spare RAM blocks in place of bad ones. Regardless, the BIST pass will leave a pattern behind in RAM. More about memory BIST here: https://www.vlsi4freshers.com/2019/12/memory-built-in-self-test-mbist-basic.html

As a board or system designer, ultimately you have little control over the SRAM power-on state. You can't even count on it being random, a topic this paper explores.

Now, how can one make SRAM be non-volatile, or behave as if it were so? With some extra support circuitry it's possible.

SRAM can be made non-volatile by employing battery backup. In a battery backed SRAM, the latch array is powered from a separate always-on supply, so the latches retain their state while the rest of the system is powered down. CMOS latches use almost no power when keeping state, so the battery requirements are very modest. A familiar use of this is PC BIOS, but it also shows up in other places such as car radios to store station presets. Battery backup adds cost and poses some issues in manufacturing (batteries don't tolerate soldering; lithium types can catch fire or explode), so it isn't a good general SRAM solution.

Another way to make SRAM nonvolatile is to 'fake' it, by using Flash or EEPROM as a shadow for critical variables. The shadow data is stored before power-off and reloaded at startup. This is very cost-effective, but needs careful attention to software design and power-down behavior to ensure that SRAM state makes it to the nonvolatile island. Also, Flash and EEPROM have wear issues so aren't good choices if the critical SRAM contents require frequent updates.
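As a sketch of the shadowing idea, the Python below packs a few critical variables into a blob and restores them at startup. The CRC check is my own addition to illustrate one way of detecting a save that did not complete before power was lost; it is not something the scheme requires. `pack_shadow` and `restore_shadow` are hypothetical names, the variables are assumed to be unsigned 32-bit values, and the actual Flash/EEPROM write is left out.

```python
import struct
import zlib

def pack_shadow(values):
    """Serialise 32-bit unsigned values and append a CRC-32 of the payload."""
    payload = struct.pack(f"<{len(values)}I", *values)
    return payload + struct.pack("<I", zlib.crc32(payload))

def restore_shadow(blob, defaults):
    """Return the saved values if the blob is intact, otherwise the defaults."""
    expected_len = 4 * (len(defaults) + 1)   # payload words plus one CRC word
    if len(blob) != expected_len:
        return list(defaults)
    payload = blob[:-4]
    (stored_crc,) = struct.unpack("<I", blob[-4:])
    if zlib.crc32(payload) != stored_crc:
        return list(defaults)                # incomplete or corrupted save
    return list(struct.unpack(f"<{len(defaults)}I", payload))

# Before power-off: snapshot the critical variables.
blob = pack_shadow([1200, 7, 42])
# At startup: reload them, falling back to known defaults if the blob is bad.
print(restore_shadow(blob, defaults=[0, 0, 0]))
```

The point of the fallback is that the restore path never hands unvalidated data to the rest of the firmware, which is the same discipline as never trusting unwritten SRAM.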

Infineon offers their NvSRAM technology, which pairs each SRAM cell with a charge storage cell alongside. At power-on, the stored charge is transferred to the SRAM latch array, restoring the latch states. This is a pretty neat solution in my opinion, but it may be costly. More here: https://www.infineon.com/cms/en/product/memories/nvsram-non-volatile-sram/

A directly competitive technology is MRAM / FRAM (magnetic / ferroelectric RAM.) MRAM/FRAM, unlike Flash or EEPROM, doesn't have wear issues. MRAM/FRAM devices primarily see use in industrial and automotive applications at smaller densities (16Mbit or less), but so far have proved difficult to scale up economically to replace DRAM. More here: https://www.everspin.com/mram-replaces-nvsram

Memristors have been proposed as another SRAM alternative. It's still early days for memristor RAM; despite the hype, I'm not aware of any SRAM-replacement products based on them. There is active interest in it for networking and AI use.

\$\endgroup\$
3
  • 1
    \$\begingroup\$ Perhaps I don't understand the question, but this answer seems to be more about how to get non-volatile SRAM, whereas I think the question was asking why SRAM doesn't initialise to a deterministic value, e.g. all zeros, at power on in the absence of a user initiated reset mechanism. \$\endgroup\$ Commented Jun 10 at 20:19
  • 1
    \$\begingroup\$ The question is actually deeper than that: why the uninitialized value seems to change over time. Their question also raised the issue of how to make SRAM nonvolatile. I covered both. \$\endgroup\$ Commented Jun 10 at 20:26
  • \$\begingroup\$ Thank you. You put a lot of effort into this answer and I appreciate it. I must apologise for my brain fart, but I accidentally wrote "non-volatile" when I meant to write "volatile". I have fixed the question. Sorry. \$\endgroup\$ Commented Jun 10 at 23:25
0
\$\begingroup\$

Think of how they’d have to implement it. The SRAM hardware doesn’t have a global clear line. Clear-on-reset means there’s a reset state machine and a counter, and the reset machine writes zero to each RAM location.
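Here is a small behavioural model of such a clear-on-reset machine, written in Python rather than an HDL purely for illustration. It assumes one word is written per clock; `RamClearFsm` and `run_reset` are hypothetical names and do not describe any real ASIC.

```python
class RamClearFsm:
    """Toy reset state machine: an address counter that zeroes one word per clock."""

    def __init__(self, ram):
        self.ram = ram
        self.addr = 0                  # the counter
        self.done = len(ram) == 0      # nothing to clear in an empty RAM

    def clock(self):
        """Advance one clock cycle: write zero at the counter, then increment."""
        if self.done:
            return
        self.ram[self.addr] = 0
        self.addr += 1
        if self.addr == len(self.ram):
            self.done = True

def run_reset(ram):
    """Clock the state machine until it finishes; return the cycles it took."""
    fsm = RamClearFsm(ram)
    cycles = 0
    while not fsm.done:
        fsm.clock()
        cycles += 1
    return cycles

ram = [0xA5, 0xFF, 0x12, 0x00]         # arbitrary power-on garbage
print(run_reset(ram), ram)
```

The cost is visible in the model: the clear takes one cycle per word, and the rest of the chip has to be held off until `done` is set, which is extra logic and extra reset time.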

In many cases, they just don’t bother with that. It’s something I personally do for reset when I can, but not many ASICs seem to be doing it.

It looks like that particular ASIC doesn’t do any RAM initialization. That’s OK, it doesn’t have to. The firmware should expect uninitialized RAM.

The answer to all 3 of your sub-questions is: if it’s not explicitly initialized, it can read as anything, and the details depend on the IC process used and the design of the cells. So, in a way, you’re splitting hairs: since hardware doesn’t initialize SRAM, it’s perfectly acceptable to read zeroes on some chips, and non-zeroes on others, since you must expect to read literally any value from an uninitialized location. Zero holds no special meaning in that case. It’s just a coincidence.

\$\endgroup\$
1
  • \$\begingroup\$ Thank you for your answer. As mentioned in the question, I am aware that uninitialised memory could read anything at all. What I'm interested in is the physics that's going on in the memory cell that could cause it to swing one way or another, and what might lead its behaviour to change over time. \$\endgroup\$ Commented Jun 10 at 23:26
