2024-08-15, 22:51 - Today's content

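MySQL is a database management system
What does hidden size mean in an LSTM?
In torch, the LSTM layer returns two values, i.e. out, _ = self.lstm(x)
In torch, batch-first has to be written out explicitly for an LSTM; otherwise it does not know whether the batch dimension comes first
Juno on the left, Juno Connect on the right
iPad keyboard: Globe + Control switches apps quickly; adding ⬇️ sends the app down
super().__init__() and super(YourClassName, self).__init__()
Is there any difference between int64 and int32?
torch.float is exactly the same as float32
LSTM from scratch in NumPy
T81-558: Applications of Deep Neural Networks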

🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️Next note🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️

MySQL is a database management system. Workbench is used to send commands to MySQL.

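🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️Next note🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️

What does hidden size mean in an LSTM? My understanding: hidden size means that the 7 input features are expanded or compressed to hidden_size dimensions, i.e. the representation gets more or fewer dimensions. After the LSTM, the result is passed on through the rest of the network, and the final layer outputs the prediction.

That understanding is correct.

Yes: hidden_size is the dimension of the hidden state (and of the cell state) that the LSTM keeps at every time step, so the input features are mapped into a vector of that size. Specifically:

Input: 7 features (e.g. loc_x, loc_y, speed).

LSTM hidden layer:

Converts the input features into a hidden state with hidden_size dimensions.

Learns higher-level representations and features.

Later layers: the hidden state is passed to a fully connected layer and so on, which outputs the final prediction (e.g. loc_x, loc_y).

The whole process:

Input features (7 dimensions).

LSTM hidden layer (converted to hidden_size dimensions).

Output layer (converted to the final prediction dimension, e.g. 2 dimensions for loc_x and loc_y).

This way the LSTM can capture complex patterns and relationships in the input sequence.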
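The 7 -> hidden_size -> 2 pipeline described above could look roughly like this in PyTorch. This is only a sketch: the class name TrajectoryLSTM, hidden_size=32, and the batch and sequence sizes are my own assumptions, not something from the course.

```python
import torch
import torch.nn as nn


class TrajectoryLSTM(nn.Module):  # hypothetical name, chosen for this sketch
    def __init__(self, input_size=7, hidden_size=32, output_size=2):
        super().__init__()
        # Maps 7 input features to a hidden state of size hidden_size
        self.lstm = nn.LSTM(input_size, hidden_size, batch_first=True)
        # Maps the hidden state to the prediction (e.g. loc_x, loc_y)
        self.fc = nn.Linear(hidden_size, output_size)

    def forward(self, x):
        out, _ = self.lstm(x)   # (B, T, hidden_size)
        last = out[:, -1, :]    # hidden state at the last time step: (B, hidden_size)
        return self.fc(last)    # (B, output_size)


x = torch.randn(4, 10, 7)       # assumed sizes: batch of 4, 10 time steps, 7 features
model = TrajectoryLSTM()
pred = model(x)
print(pred.shape)               # torch.Size([4, 2])
```

Changing hidden_size only changes the width of the intermediate representation; the input (7) and output (2) sizes stay the same.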

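🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️Next note🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️

In torch, the LSTM layer returns two values, i.e. out, _ = self.lstm(x). Writing it as out, (h_n, c_n) = self.lstm(x) is equally correct.

The _ here stands for the tuple (h_n, c_n).

Two very important concepts in an LSTM: the hidden state (h_n) and the cell state (c_n).

And what is out? It holds the hidden state of the last layer at every time step, and its shape is (B, T, hidden_size) when batch_first=True.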
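A quick shape check for the two return values, as a sketch. The sizes (B=4, T=10, 7 features, hidden_size=16) are made up for illustration, and the last comparison only holds for a single-layer, unidirectional LSTM.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
lstm = nn.LSTM(input_size=7, hidden_size=16, num_layers=1, batch_first=True)
x = torch.randn(4, 10, 7)            # (B, T, features)

out, (h_n, c_n) = lstm(x)
print(out.shape)                     # torch.Size([4, 10, 16]) -> (B, T, hidden_size)
print(h_n.shape)                     # torch.Size([1, 4, 16])  -> (num_layers, B, hidden_size)
print(c_n.shape)                     # torch.Size([1, 4, 16])

# For one unidirectional layer, the last time step of out is the final hidden state
same = torch.allclose(out[:, -1, :], h_n[-1])
print(same)                          # True
```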

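🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️Next note🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️

In torch, batch-first has to be written out explicitly for an LSTM (batch_first=True). Otherwise the layer does not treat the batch dimension as the first one: the default is batch_first=False, which expects input of shape (T, B, input_size).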
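A small sketch of what batch_first changes. The sizes are assumptions for illustration. Note the pitfall at the end: passing a (B, T, features) tensor to a default LSTM does not raise an error, it is just read as (T, B, features).

```python
import torch
import torch.nn as nn

B, T, n_feat, H = 4, 10, 7, 16       # assumed sizes for this sketch
x = torch.randn(B, T, n_feat)        # batch-first data

lstm_bf = nn.LSTM(n_feat, H, batch_first=True)
out_bf, _ = lstm_bf(x)
print(out_bf.shape)                  # torch.Size([4, 10, 16]) -> (B, T, H)

lstm_default = nn.LSTM(n_feat, H)    # batch_first=False by default
out_tb, _ = lstm_default(x.transpose(0, 1))   # input must be (T, B, features)
print(out_tb.shape)                  # torch.Size([10, 4, 16]) -> (T, B, H)

# Pitfall: no error here, but x is interpreted as 4 time steps with batch size 10
out_wrong, _ = lstm_default(x)
print(out_wrong.shape)               # torch.Size([4, 10, 16]), read as (T=4, B=10, H)
```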

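🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️Next note🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️

Open Juno on the left and Juno Connect on the right. In Finder, connect to my 4090 machine through the VPN; then I can find a course's Jupyter notebook in Finder and open it in Juno. I can SSH in through Juno Connect to practice. So whenever I see someone else's notebook that looks worthwhile, I can download it by cloning it over SSH onto my Linux machine and then study it. Studying this way only requires bringing the iPad and making sure the VPN works. The Mac only needs to come along when an extra screen is needed. In general, the less stuff the better.

It is best to take notes in Apple Notes while studying. That way I can find my earlier notes through Spotlight, which is very convenient and also helps build long-term memory.

If you ask why I use Juno to read other people's Jupyter notebooks: opening a notebook in Safari or Chrome is very unfriendly. Juno is built specifically for notebooks, so it lets you focus totally on the content.

As for why not Colab: because we have our own 4090! And a VPN, which is a perfect setup!

No need to pay to compete with other people for cheap resources. Hahaha

Usually, once on the iPad: Globe + Control + Left puts Juno on the left, and Globe + Control + Right puts Juno Connect on the right.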

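🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️Next note🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️

iPad keyboard: Globe + Control switches apps quickly; adding ⬇️ sends the app down. Put the pointer on an app, then press Globe + Control + 👈 or 👉 to pin it to that side; after that you can pick the new window to pin next to it.

Quick Note is Globe + Q.

Notification Center is Globe + N.

Control Center is Globe + C.

The Home Screen is Globe + H.

Viewing the shortcuts is Globe + M.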

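🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️Next note🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️

super().__init__() and super(YourClassName, self).__init__() do exactly the same thing inside a class, so just write the first, much simpler form. The second is the Python 2 way of writing it; Python 3 still accepts it, but it is no longer needed.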
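A tiny sketch to confirm both spellings call the parent initializer. The class names Base, ShortForm and LongForm are made up for this example.

```python
class Base:
    def __init__(self):
        self.ready = True            # set by the parent initializer


class ShortForm(Base):
    def __init__(self):
        super().__init__()           # Python 3 form
        self.name = "short"


class LongForm(Base):
    def __init__(self):
        super(LongForm, self).__init__()   # Python 2 style, still valid in Python 3
        self.name = "long"


a = ShortForm()
b = LongForm()
print(a.ready, b.ready)              # True True
```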

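🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️Next note🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️

Is there any difference between int64 and int32? During training we usually make y int64, because that is what the loss function requires.

This matters especially when computing torch.nn.CrossEntropyLoss at the end: the class-index labels y must be int64 (torch.long). Note that this requirement is for categorical labels. Targets that are one-hot encoded go through a series of computations together with the model's predictions, so they should be floating point, float32. Use something.to(torch.float32) to convert.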
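A sketch of the two target formats with torch.nn.CrossEntropyLoss. The logits and labels are made-up values; passing float class-probability targets assumes a reasonably recent PyTorch (1.10 or later, as far as I know).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
logits = torch.randn(5, 3)                 # assumed: 5 samples, 3 classes
loss_fn = nn.CrossEntropyLoss()

# Class-index labels: integers created this way are int64 (torch.long) by default
y_idx = torch.tensor([0, 2, 1, 1, 0])
print(y_idx.dtype)                         # torch.int64
loss_idx = loss_fn(logits, y_idx)

# One-hot targets: convert to float32 before mixing them with predictions
y_onehot = nn.functional.one_hot(y_idx, num_classes=3).to(torch.float32)
print(y_onehot.dtype)                      # torch.float32
loss_prob = loss_fn(logits, y_onehot)      # treated as class probabilities

print(torch.allclose(loss_idx, loss_prob)) # True: both describe the same labels
```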

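The difference between int32 and int64 is mainly their storage size and representable range:

int32:

Size: 32 bits (4 bytes)

Range: -2,147,483,648 to 2,147,483,647

int64:

Size: 64 bits (8 bytes)

Range: -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807

Whether to use int32 or int64 depends on the range of integers you need to represent, traded off against memory use.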
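The ranges above follow directly from the bit width of a signed two's-complement integer. A plain-Python sketch (signed_range is a helper name I made up):

```python
def signed_range(bits):
    """Return (min, max) of a signed two's-complement integer with `bits` bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for bits in (32, 64):
    lo, hi = signed_range(bits)
    print(f"int{bits}: {bits // 8} bytes, {lo:,} to {hi:,}")
```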

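🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️Next note🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️

torch.float is exactly the same as float32. Here is an example. Remember that float means float32, because it is the most commonly used floating-point type. If you want float64 instead, just call Tensor.double().

import torch

max_features = 10  # assumed value; not defined in the original note
x_data = torch.randint(0, max_features, (6, 6, 1))  # integer tensor (int64)
x = x_data.float()  # float32

Another example:

x_data = torch.randint(0, 10, (6, 6, 1))
x_float32 = x_data.float()   # converts to float32, the default float type
x_float64 = x_data.double()  # converts to float64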

🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️Next note🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️

LSTM from scratch in NumPy

import numpy as np

class LSTM:
    def __init__(self, input_dim, hidden_dim):
        self.input_dim = input_dim
        self.hidden_dim = hidden_dim
        # Initialize weights; each gate acts on the concatenation [h_prev; x]
        self.Wf = np.random.randn(hidden_dim, input_dim + hidden_dim)
        self.Wi = np.random.randn(hidden_dim, input_dim + hidden_dim)
        self.Wo = np.random.randn(hidden_dim, input_dim + hidden_dim)
        self.Wc = np.random.randn(hidden_dim, input_dim + hidden_dim)
        self.bf = np.zeros((hidden_dim, 1))
        self.bi = np.zeros((hidden_dim, 1))
        self.bo = np.zeros((hidden_dim, 1))
        self.bc = np.zeros((hidden_dim, 1))

    def forward(self, x, h_prev, c_prev):
        # One time step: x is (input_dim, 1); h_prev and c_prev are (hidden_dim, 1)
        combined = np.concatenate((h_prev, x), axis=0)
        ft = self.sigmoid(np.dot(self.Wf, combined) + self.bf)  # forget gate
        it = self.sigmoid(np.dot(self.Wi, combined) + self.bi)  # input gate
        ot = self.sigmoid(np.dot(self.Wo, combined) + self.bo)  # output gate
        cct = np.tanh(np.dot(self.Wc, combined) + self.bc)      # candidate cell state
        ct = ft * c_prev + it * cct                             # new cell state
        ht = ot * np.tanh(ct)                                   # new hidden state
        return ht, ct

    def sigmoid(self, x):
        return 1 / (1 + np.exp(-x))

# Example
input_dim = 10
hidden_dim = 20
lstm = LSTM(input_dim, hidden_dim)
x = np.random.randn(input_dim, 1)
h_prev = np.zeros((hidden_dim, 1))
c_prev = np.zeros((hidden_dim, 1))
h, c = lstm.forward(x, h_prev, c_prev)
print(h)

class Attention:
    def __init__(self, hidden_dim):
        self.hidden_dim = hidden_dim
        self.Wa = np.random.randn(hidden_dim, hidden_dim)
        self.ba = np.zeros((hidden_dim, 1))

    def forward(self, hidden_states, h_current):
        # hidden_states: (hidden_dim, T); h_current: (hidden_dim, 1)
        # Score every time step against the current hidden state -> (1, T)
        scores = np.dot(h_current.T, np.dot(self.Wa, hidden_states) + self.ba)
        attention_weights = self.softmax(scores)  # sums to 1 over the time steps
        # Weighted sum of the hidden states over time -> (hidden_dim, 1)
        context_vector = np.sum(attention_weights * hidden_states, axis=1, keepdims=True)
        return context_vector

    def softmax(self, x):
        # Normalize along the time axis (axis=1)
        e_x = np.exp(x - np.max(x))
        return e_x / e_x.sum(axis=1, keepdims=True)

# Example
attention = Attention(hidden_dim)
hidden_states = np.random.randn(hidden_dim, 10)  # assume 10 time steps
h_current = np.random.randn(hidden_dim, 1)
context_vector = attention.forward(hidden_states, h_current)
print(context_vector)
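The attention example uses random hidden_states. In practice they would come from running the LSTM step over a sequence, so here is a self-contained sketch of that loop. It is my own compact variant, not the class from the note: the four gate matrices are stacked into one matrix W, the weights are scaled by 0.1, and T=5 is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim, T = 10, 20, 5          # assumed sizes for this sketch

# One stacked weight matrix for the four gates: forget, input, output, candidate
W = rng.standard_normal((4 * hidden_dim, hidden_dim + input_dim)) * 0.1
b = np.zeros((4 * hidden_dim, 1))


def sigmoid(z):
    return 1 / (1 + np.exp(-z))


def lstm_step(x, h_prev, c_prev):
    """One LSTM time step on column vectors."""
    combined = np.concatenate((h_prev, x), axis=0)
    f, i, o, g = np.split(W @ combined + b, 4, axis=0)
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(c)
    return h, c


xs = rng.standard_normal((T, input_dim, 1))   # a sequence of T input vectors
h = np.zeros((hidden_dim, 1))
c = np.zeros((hidden_dim, 1))
hs = []
for x_t in xs:
    h, c = lstm_step(x_t, h, c)               # carry h and c to the next step
    hs.append(h)

hidden_states = np.concatenate(hs, axis=1)    # (hidden_dim, T), one column per step
print(hidden_states.shape)                    # (20, 5)
```

The resulting hidden_states has one column per time step, which is the (hidden_dim, T) layout the attention example takes as input, with the final h playing the role of h_current.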

#lstm #numpy #scratch

🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️Next note🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️🐿️

T81-558: Applications of Deep Neural Networks is a course at Washington University in St. Louis.

T81-558:Applications of Deep Neural Networks

Jeff Heaton

Washington University in St. Louis

#TodaysContentSummary