# LSTM

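Long short-term memory, LSTM for short, was proposed by Hochreiter and Schmidhuber in 1997.

It redesigns the recurrent layer unit so that the hidden state is no longer computed directly with the plain RNN formula given earlier.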

![](/files/-LqCqGMhyL6mNi7rDbXn)

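## Three components: input gate, forget gate, and output gate

LSTM computes $$h\_t$$ from $$h\_{t-1}$$ in a different way: the state value is determined by the output gate and the memory value.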

$$
h\_t=o\_t \odot \tanh(c\_t)
$$
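A minimal sketch of this element-wise product in plain Python, assuming `o_t` and `c_t` are already available as equal-length lists. The function name `hidden_state` and the sample values are illustrative and not part of the original text.

```python
import math

def hidden_state(o_t, c_t):
    """Element-wise h_t = o_t * tanh(c_t) for equal-length lists."""
    return [o * math.tanh(c) for o, c in zip(o_t, c_t)]

# Hypothetical values: an output gate of 0 blocks the memory entirely,
# while a gate of 1 passes tanh(c_t) through unchanged.
h_t = hidden_state([1.0, 0.5, 0.0], [0.0, 2.0, 5.0])
```

Because each output-gate entry lies between 0 and 1 and `tanh` lies between -1 and 1, every entry of `h_t` stays within (-1, 1).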

## Output gate

![](/files/-LqCqGMlghomr2fTkj3v)

Compared with the RNN formula, the computation becomes:

$$
o\_t=\sigma(W\_{xo}x\_t+W\_{ho}h\_{t-1}+b\_o)
$$
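The gate formula can be sketched in plain Python as follows. This assumes matrices are stored as lists of rows; the helper names `sigmoid`, `matvec`, and `gate`, the layer sizes, and all weight values are hypothetical choices for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def matvec(W, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def gate(W_x, W_h, b, x_t, h_prev):
    """sigma(W_x x_t + W_h h_{t-1} + b), applied element-wise."""
    a = matvec(W_x, x_t)
    r = matvec(W_h, h_prev)
    return [sigmoid(ai + ri + bi) for ai, ri, bi in zip(a, r, b)]

# Hypothetical sizes: 2 input features, 2 hidden units.
W_xo = [[0.5, -0.5], [0.0, 1.0]]
W_ho = [[0.1, 0.0], [0.0, 0.1]]
b_o = [0.0, 0.0]
o_t = gate(W_xo, W_ho, b_o, x_t=[1.0, 1.0], h_prev=[0.0, 0.0])
```

The sigmoid keeps every gate entry strictly between 0 and 1, so it acts as a soft switch rather than a hard on/off decision.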

## Memory value

![](/files/-LqCqGMnHE3PJNHF9stR)

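The memory value is the state that the recurrent-layer neuron remembers from the previous time step; it is updated over time as a weighted combination of the old memory and the new input.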

$$
c\_t=f\_t \odot c\_{t-1}+i\_t \odot \tanh(W\_{xc}x\_t+W\_{hc}h\_{t-1}+b\_c)
$$
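As a rough illustration of this weighted update, the sketch below takes the gate values as given. The function name `update_memory` and the argument `candidate_pre`, which stands in for the pre-activation $$W\_{xc}x\_t+W\_{hc}h\_{t-1}+b\_c$$, are assumptions made for this example.

```python
import math

def update_memory(f_t, i_t, c_prev, candidate_pre):
    """Element-wise c_t = f_t * c_{t-1} + i_t * tanh(candidate_pre)."""
    return [
        f * c + i * math.tanh(a)
        for f, i, c, a in zip(f_t, i_t, c_prev, candidate_pre)
    ]

# Hypothetical extreme gate settings for two memory units:
# unit 0 keeps its old memory and ignores the new input,
# unit 1 discards its old memory and takes only the new input.
c_t = update_memory(
    f_t=[1.0, 0.0],
    i_t=[0.0, 1.0],
    c_prev=[2.0, 2.0],
    candidate_pre=[3.0, 0.0],
)
```

In practice the gates take values between these extremes, so each memory unit blends old and new information.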

## Forget gate

![](/files/-LqCqGMp1vB8O_UZzp1_)

The formula is:

$$
f\_t=\sigma(W\_{xf}x\_t+W\_{hf}h\_{t-1}+b\_f)
$$

## Input gate

![](/files/-LqCqGMrwZnhn7WeWiws)

The input gate controls how much of the current input is allowed into the memory cell.

$$
i\_t=\sigma(W\_{xi}x\_t+W\_{hi}h\_{t-1}+b\_i)
$$
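With the three gates and the memory update defined, one full time step can be sketched as below. This is a plain-Python illustration under stated assumptions, not a reference implementation: the names `affine`, `lstm_step`, and `params`, the dictionary layout, and the all-zero toy weights are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W_x, W_h, b, x_t, h_prev):
    """W_x x_t + W_h h_{t-1} + b for list-of-rows matrices."""
    return [
        sum(w * x for w, x in zip(row_x, x_t))
        + sum(w * h for w, h in zip(row_h, h_prev))
        + bias
        for row_x, row_h, bias in zip(W_x, W_h, b)
    ]

def lstm_step(params, x_t, h_prev, c_prev):
    """One time step; params maps 'i', 'f', 'o', 'c' to (W_x, W_h, b)."""
    i_t = [sigmoid(a) for a in affine(*params["i"], x_t, h_prev)]  # input gate
    f_t = [sigmoid(a) for a in affine(*params["f"], x_t, h_prev)]  # forget gate
    o_t = [sigmoid(a) for a in affine(*params["o"], x_t, h_prev)]  # output gate
    g_t = [math.tanh(a) for a in affine(*params["c"], x_t, h_prev)]  # candidate
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, g_t)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t

# Toy setup: 1 input feature, 1 hidden unit, all weights and biases zero,
# so every gate evaluates to sigmoid(0) = 0.5 and the candidate to tanh(0) = 0.
zero = ([[0.0]], [[0.0]], [0.0])
params = {"i": zero, "f": zero, "o": zero, "c": zero}
h_t, c_t = lstm_step(params, x_t=[1.0], h_prev=[0.0], c_prev=[2.0])
```

In this toy case the new memory is half of the old one (`0.5 * 2.0 + 0.5 * 0.0`), which makes the role of the forget gate easy to check by hand.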

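## Discussion

> The input gate acts on the current input and the forget gate acts on the previous memory value. Their weighted sum gives the aggregated information, and the output gate then decides the output value.
>
> If the LSTM outputs are unrolled over time, part of the contribution from the earliest inputs avoids repeated multiplication by the weight matrix and enters through addition instead: part of the earlier information is added to part of the new information to form the new memory value. This is the main reason LSTM can mitigate the vanishing-gradient problem.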

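The unrolling argument can be made explicit with a short derivation. As a sketch, write $$g\_t$$ for the candidate term (a shorthand introduced here, not used in the equations above) and treat the gate values as given:

$$
g\_t=\tanh(W\_{xc}x\_t+W\_{hc}h\_{t-1}+b\_c)
$$

Substituting the memory update into itself once gives

$$
c\_t=f\_t \odot f\_{t-1} \odot c\_{t-2}+f\_t \odot i\_{t-1} \odot g\_{t-1}+i\_t \odot g\_t
$$

so older memory reaches $$c\_t$$ through element-wise products with forget gates and through additions, without passing through a weight matrix. Along this direct path, and ignoring the indirect dependence of the gates on $$c\_{t-1}$$ through $$h\_{t-1}$$, the derivative is

$$
\frac{\partial c\_t}{\partial c\_{t-1}}=\mathrm{diag}(f\_t)
$$

When the forget gate stays close to 1, this factor stays close to the identity, so the gradient along the memory path shrinks far more slowly than in a plain RNN, where each step multiplies by the recurrent weight matrix and an activation derivative.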
