Feng Dan: memristor RRAM most promising to replace DRAM

Recently, the annual China Storage Summit was held in Beijing as scheduled. The theme of this year's conference was 'Data flow, riot boat', and its topics covered the future of storage and how to release the value of data. Industry guests presented the status and development of the storage market in China and around the world. In the afternoon, at the third sub-forum, Feng Dan, director of the Information Storage Special Committee of the China Computer Federation, gave the opening talk on the trend toward memristor integration and on methods for optimizing the performance of RRAM (resistive random-access memory). Feng Dan said that memristors are currently trending toward large capacity and the integration of computing and storage. With its large capacity, high speed, and low power consumption, RRAM is also considered a good choice for the next-generation replacement of DRAM (dynamic random-access memory).

Feng Dan introduced the background to memristor development from three aspects. The first is market demand: IDC predicts that the global data volume will reach 40 ZB by 2020, an enormous amount of data. The second is storage demand, including the storage needs of high-performance computing and of a variety of network applications, where demand is growing even faster. For example, 12306 (China's railway ticketing site) handles more than 30 billion page views (PV) per day during the Spring Festival, with concurrent access to 1.3 GB of data per second, so its demand for memory is very large. Workloads such as big data analytics also run entirely in memory, and large-scale computing will require about 1,000 times more memory, leaving a huge gap between memory demand and supply.

Memristor RRAM most promising to replace DRAM

At present, DRAM stores data as the amount of charge in a capacitor. The capacitor must be large enough to increase the retention time and reduce the refresh rate, which limits capacity, increases energy consumption, and makes it difficult to scale down the process technology. The first problem is capacity: CPU performance is increasing fast, and memory capacity is growing much more slowly than CPU performance, which is commonly referred to as the memory wall. The other problem is energy consumption: as capacity increases further, leakage power increases further. Memory accounts for 40-50% of a server's energy consumption, and 40% of DRAM's energy consumption comes from refresh.
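To put the two energy figures above together, the following minimal sketch multiplies them to estimate how much of a server's total energy goes to DRAM refresh. It assumes that all of the server's memory energy is DRAM energy, which the talk does not state; the function name is purely illustrative.

```python
def refresh_share_of_server_energy(memory_share, refresh_share_of_dram):
    """Fraction of total server energy spent on DRAM refresh.

    Assumption (not from the talk): all memory energy is DRAM energy.
    """
    return memory_share * refresh_share_of_dram


# Figures quoted in the talk: memory is 40-50% of server energy,
# and refresh is 40% of DRAM energy.
low = refresh_share_of_server_energy(0.40, 0.40)
high = refresh_share_of_server_energy(0.50, 0.40)
print(f"Refresh is roughly {low:.0%} to {high:.0%} of server energy")
```

Under that assumption, refresh alone would account for roughly a sixth to a fifth of server energy, which is why a memory that needs no refresh is attractive.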

The ITRS report pointed out that DRAM has difficulty maintaining scalability below the 20nm technology node and that DRAM scaling will stop after reaching X-nm; once the DRAM process reaches a few nanometers, scalability is limited. Feng Dan said that among the new memories, which also include spin-transfer memory, the most typical representative is resistive memory. Through continuous research and development, current RRAM offers large capacity, very high speed, and low energy consumption, and it is also considered a good choice for the next-generation replacement of DRAM.

Taking RRAM as an example, the main principle of a metal-oxide memory device is to use a memristor for storage. In the low-resistance state, the conductive filament can be broken so that the cell switches to the high-resistance state (the RESET operation); this operation takes a relatively long time and has a large delay. Conversely, in the high-resistance state, applying a voltage of a certain magnitude re-forms the conductive filament and switches the cell from the high-resistance state back to the low-resistance state (the SET operation).

There are two structures for the RRAM array. One is the 1T1R (one-transistor, one-resistor) structure, in which an access transistor is required at each cross-point to gate each cell independently. Its disadvantage is clear: the total chip area of a 1T1R RRAM depends on the area occupied by the transistors, so the storage density is low. The other is the crossbar structure, in which each memory cell sits at the intersection of a horizontal word line (WL) and a vertical bit line (BL). Each cell occupies an area of 4F² (F is the technology feature size), which is the theoretical minimum for a single-layer array. The advantage is high storage density. The shortcomings are the voltage drop across the interconnect and the sneak current paths, which degrade read and write performance, increase energy consumption, and cause problems such as write disturbance. A lot of research is built around these issues.
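The 4F² figure translates directly into an upper bound on single-layer density. The sketch below computes that bound for two feature sizes; the values of 40 nm and 20 nm are illustrative choices, not measurements from the talk, and the function names are hypothetical.

```python
NM2_PER_MM2 = 1e12  # 1 mm = 1e6 nm, so 1 mm^2 = 1e12 nm^2


def crossbar_cell_area_nm2(feature_size_nm):
    """Area of one crossbar cell, 4 * F^2, in square nanometers."""
    return 4 * feature_size_nm ** 2


def max_cells_per_mm2(feature_size_nm):
    """Upper bound on cells per mm^2 for a single-layer crossbar.

    Ignores peripheral circuits, so real chips will be less dense.
    """
    return NM2_PER_MM2 / crossbar_cell_area_nm2(feature_size_nm)


for f_nm in (40, 20):  # illustrative feature sizes
    area = crossbar_cell_area_nm2(f_nm)
    density = max_cells_per_mm2(f_nm)
    print(f"F = {f_nm} nm: cell area {area} nm^2, up to {density:.3e} cells/mm^2")
```

Because the area scales with F², halving the feature size quadruples the bound. A 1T1R cell is larger than 4F² because the transistor sets the footprint, which is the density penalty described above.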

The biggest drawback of RRAM is its severe device-level variability. Changing the state of an RRAM device requires applying a voltage to the two electrodes to control the drift of oxygen ions under the electric field and their diffusion under thermal driving. This makes the three-dimensional shape of the conductive filament difficult to control and, combined with the impact of noise, results in device-level variability. Device-level variability is the key issue in building reliable chip products.

Large capacity and deep integration of computing and storage: the memristor trends

RRAM with the crossbar structure offers more storage capacity than the 1T1R structure, and SLC performance is higher than that of MLC. The memory capacity of RRAM prototype chips has gradually increased from the Mb level to the Gb level, while technology nodes have gradually shrunk and read/write performance has gradually improved. Judging by the development of capacity and read/write bandwidth, although RRAM started relatively late, its storage capacity is growing rapidly, and RRAM has an advantage in read/write bandwidth over PCRAM and STT-MRAM. On the other hand, memristor-based neuromorphic computing systems are also evolving. A crossbar array of memristors can be used to accelerate the matrix-vector multiplication commonly found in neuromorphic computation. Because this is an analog computation, improving computational accuracy requires solving the voltage drop on the interconnect wires and the reliability issues caused by device variation. This is the deep integration of computing and storage.
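The matrix-vector multiplication mentioned above follows from Ohm's law and Kirchhoff's current law: each cell contributes a current equal to its row voltage times its conductance, and the currents on a column add up. The following is a minimal sketch of the ideal case; it deliberately ignores wire resistance, sneak paths, and device variation, which are exactly the accuracy problems the talk points to. The function name and the example values are illustrative.

```python
def crossbar_mvm(row_voltages, conductances):
    """Ideal crossbar matrix-vector product.

    row_voltages[i] is the voltage applied to word line i (volts).
    conductances[i][j] is the cell at word line i, bit line j (siemens).
    Returns the current collected on each bit line (amperes):
        I[j] = sum_i V[i] * G[i][j]
    """
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(row_voltages, conductances))
        for j in range(n_cols)
    ]


# Illustrative 2x2 array: inputs encoded as voltages, weights as conductances.
voltages = [1.0, 0.5]
cells = [
    [1e-6, 2e-6],
    [3e-6, 4e-6],
]
currents = crossbar_mvm(voltages, cells)
print(currents)  # approximately [2.5e-06, 4e-06] amperes
```

All columns are read out at once, so the whole product takes one analog step instead of one multiply-accumulate per weight.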

In terms of device variability, the state change of a memristor approximately follows a log-normal distribution, so all the memristors in the array need to be tested in advance, and the pattern of variation is obtained by collecting statistics on their resistance-state distributions. Two rows or two columns of the weight matrix are then swapped, together with the corresponding elements of the input and output vectors, so that larger synaptic weights are mapped onto memristors with less variability, thereby reducing the variability of the network output.
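The row-swapping idea can be sketched as follows. This is not the algorithm from the talk, only a simple greedy stand-in: it ranks weight-matrix rows by total absolute weight, ranks physical crossbar rows by a pre-measured variability score, and pairs them up. The variability numbers, the ranking rule, and all names are assumptions made for illustration; column swaps (which would permute the output vector) are omitted.

```python
def map_rows_by_variability(weights, row_variability):
    """Return placement, where placement[p] is the logical row stored on physical row p.

    Rows with larger total |weight| go to physical rows with lower variability.
    """
    logical = sorted(range(len(weights)),
                     key=lambda i: -sum(abs(w) for w in weights[i]))
    physical = sorted(range(len(row_variability)),
                      key=lambda p: row_variability[p])
    placement = [None] * len(weights)
    for logical_row, physical_row in zip(logical, physical):
        placement[physical_row] = logical_row
    return placement


def apply_placement(weights, inputs, placement):
    """Permute weight rows and the matching input elements together."""
    return [weights[i] for i in placement], [inputs[i] for i in placement]


def mvm(inputs, weights):
    """Reference product y[j] = sum_i x[i] * W[i][j]."""
    return [sum(x * row[j] for x, row in zip(inputs, weights))
            for j in range(len(weights[0]))]


weights = [[0.1, 0.2], [0.9, 0.8], [0.4, 0.5]]
row_variability = [0.05, 0.30, 0.10]  # hypothetical pre-test results per physical row
inputs = [1.0, 2.0, 3.0]

placement = map_rows_by_variability(weights, row_variability)
mapped_weights, mapped_inputs = apply_placement(weights, inputs, placement)
print(placement)  # [1, 0, 2]: the heaviest row lands on the least variable physical row
```

Because the inputs are permuted together with the rows, the ideal output is unchanged; only the placement of large weights on the physical devices differs.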

When the neural network is relatively large in scale, a conventional two-dimensional design needs many arrays to compute jointly, and energy consumption increases. With a three-dimensional structure, the pillar electrodes lie on the same plane, so the energy consumed in computing the entire large-scale neural network can be reduced and lower latency can be achieved. In addition, logic operations can be implemented to meet changing computing needs.

For AI-oriented neural network computing, when memory capacity is insufficient, computing inside large-capacity storage reduces data movement and yields better performance. At present, academia and industry have introduced some corresponding samples, but actual products are still relatively few. SMC and the Chinese Academy of Microelectronics have jointly developed a chip, and in January this year the US company Crossbar announced that its 3-D stacked 1TnR RRAM array chips, built on SMIC's 40nm process, had officially begun sampling. Memristors still need some time before they see real use, but the trend is toward large capacity.

How to optimize high-capacity RRAM performance?

The IR drop, caused by line resistance and current leakage, reduces the voltage applied across the selected cell. The RESET delay of a ReRAM cell, however, decreases exponentially as the voltage applied across it increases, so an IR drop greatly increases the access delay. To reduce current leakage, a half-bias write scheme is generally used. To alleviate the IR drop problem, a double-ended grounded circuit design (DSGB) reduces the IR drop on the wordline and greatly reduces the RESET delay: for an 8-bit write to a 512 × 512 array, the worst-case RESET delay drops from 682 ns to 240 ns.
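The half-bias write scheme mentioned above can be illustrated with a small voltage map. In the usual V/2 convention, the selected word line is driven to the full write voltage, the selected bit line is grounded, and every unselected line is held at half the write voltage. The sketch below assumes that convention and ideal wires (no IR drop); which line is driven high depends on the cell polarity, and the array size and voltage are illustrative.

```python
def half_bias_cell_voltages(n_rows, n_cols, sel_row, sel_col, v_write):
    """Voltage across every cell under an ideal V/2 bias scheme.

    Assumes: selected word line at v_write, selected bit line at 0 V,
    all unselected word and bit lines at v_write / 2, no wire resistance.
    """
    half = v_write / 2
    word_lines = [v_write if r == sel_row else half for r in range(n_rows)]
    bit_lines = [0.0 if c == sel_col else half for c in range(n_cols)]
    return [[word_lines[r] - bit_lines[c] for c in range(n_cols)]
            for r in range(n_rows)]


cells = half_bias_cell_voltages(4, 4, sel_row=1, sel_col=2, v_write=2.0)
flat = [v for row in cells for v in row]
full_selected = flat.count(2.0)  # only the target cell sees the full voltage
half_selected = flat.count(1.0)  # cells sharing the selected row or column
unselected = flat.count(0.0)     # all other cells see no voltage
print(full_selected, half_selected, unselected)  # 1 6 9
```

Only the cells on the selected row and column are half-selected and leak, instead of the whole array. Those remaining leakage currents, flowing through resistive wires, are what produce the IR drop discussed above.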

A region-partitioned, double-sided write driver (DSWD) is also used. Without the DSWD mechanism, the array IR drop is severe and the RESET delay increases exponentially for an 8-bit write to a 1024 × 1024 array. The DSWD mechanism reduces the IR drop on the bitline and raises the voltage of the cells in the rows above row 512, greatly reducing the RESET delay.
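One way to see why a second write driver helps is to look only at wire length. The sketch below assumes that the bitline IR drop grows with the number of rows between a cell and its nearest write driver, and that the two drivers sit next to row 0 and the last row. It models distance only, not actual voltages or delays, and the function names are illustrative.

```python
def distance_to_nearest_driver(row, n_rows, double_sided):
    """Rows of bitline wire between a cell and its nearest write driver.

    Assumes one driver next to row 0 and, if double_sided, a second
    driver next to row n_rows - 1.
    """
    if double_sided:
        return min(row, n_rows - 1 - row)
    return row


def worst_case_distance(n_rows, double_sided):
    """Longest driver-to-cell distance over all rows."""
    return max(distance_to_nearest_driver(r, n_rows, double_sided)
               for r in range(n_rows))


single = worst_case_distance(1024, double_sided=False)
double = worst_case_distance(1024, double_sided=True)
print(single, double)  # 1023 511
```

For a 1024-row array, the worst-case wire length roughly halves, and all of the benefit goes to the rows in the far half of the array, which matches the description of raised voltages for rows above 512.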

Rows near the write driver see a smaller IR drop on the bitline and have a smaller access delay; rows far from the write driver have a larger access delay.

The crossbar array is divided into fast and slow regions according to the different delays of its rows. With voltage biasing based on the effective current path, the peripheral circuit closest to the target cell is selected to apply the write voltage, which reduces the voltage drop on the wires and the write delay. Block-diagonal region division narrows the differences in cell access latency and reduces the write latency of each region. Beyond circuit-level techniques, coding methods can also be used to improve the performance of TLC memristor RRAM.
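The fast/slow division can be sketched with a toy delay model. Everything numeric here is hypothetical: the model assumes the IR drop grows linearly with the row's distance from the write driver and that the RESET delay grows exponentially with that drop, in line with the qualitative behavior described in the talk. The parameter values and the threshold were chosen only to make the example readable.

```python
import math


def estimated_reset_delay_ns(row, base_delay_ns, drop_per_row_v, k_per_v):
    """Toy model: delay = base * exp(k * IR drop), IR drop linear in row index.

    All parameters are hypothetical, not measured values.
    """
    return base_delay_ns * math.exp(k_per_v * drop_per_row_v * row)


def split_fast_slow(n_rows, threshold_ns, base_delay_ns, drop_per_row_v, k_per_v):
    """Partition row indices into (fast, slow) by estimated RESET delay."""
    fast, slow = [], []
    for row in range(n_rows):
        delay = estimated_reset_delay_ns(row, base_delay_ns, drop_per_row_v, k_per_v)
        (fast if delay <= threshold_ns else slow).append(row)
    return fast, slow


fast, slow = split_fast_slow(
    n_rows=8,
    threshold_ns=200.0,   # hypothetical latency budget for the fast region
    base_delay_ns=100.0,  # hypothetical delay of the row next to the driver
    drop_per_row_v=0.05,  # hypothetical extra IR drop per row
    k_per_v=4.0,          # hypothetical exponential sensitivity
)
print(fast, slow)  # [0, 1, 2, 3] [4, 5, 6, 7]
```

Once rows are grouped this way, each region can be given its own write latency instead of every access paying the worst-case delay.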
