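Redundancy of the Hyper-Connections Expansion Factor Hyper-Connections introduces an expansion factor $n$ and maintains a hyper-hidden matrix $\mathbf{H}^l \in \mathbb{R}^{n \times d}$ at each layer, with the update rule: \mathbf{H}^l = \mathbf{A}_r^{l\top} \mathbf{H}^{l-1} + \mathbf{B}^{l\top} \mathcal{T}_l(\mathbf{A}_m^{l\top} \m 2026-05-08 Technical Notes #DeepLearning #Transformer #PaperAnalysis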
Hello World Welcome to Hexo! This is your very first post. Check documentation for more info. If you get any problems when using Hexo, you can find the answer in troubleshooting or you can ask me on GitHub. Quick 2026-05-07