Core Papers
Core papers in the field of AI, covering deep learning foundations, large language models, agents, and more
29 papers in total
Deep Learning Foundations
Foundational work, including core papers such as AlexNet, ResNet, and the Transformer
4 papers
ImageNet Classification with Deep Convolutional Neural Networks (AlexNet)
2012
Deep Residual Learning for Image Recognition (ResNet)
2016
Attention Is All You Need (Transformer)
2017
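The Transformer's core operation is scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V. A minimal single-head NumPy sketch, without masking or the multi-head projections; shapes are illustrative:

    import numpy as np

    def scaled_dot_product_attention(Q, K, V):
        # Q, K: (seq_len, d_k); V: (seq_len, d_v)
        d_k = Q.shape[-1]
        # Scale scores by sqrt(d_k) to keep the softmax in a well-behaved range
        scores = Q @ K.T / np.sqrt(d_k)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
        return weights @ V  # weighted sum of values

    # Toy usage: 4 tokens, d_k = d_v = 8
    rng = np.random.default_rng(0)
    Q, K, V = (rng.normal(size=(4, 8)) for _ in range(3))
    out = scaled_dot_product_attention(Q, K, V)  # shape (4, 8)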
Explaining and Harnessing Adversarial Examples
2015
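The paper introduces the fast gradient sign method (FGSM): perturb the input by epsilon in the direction of the sign of the loss gradient. A minimal sketch, where grad_wrt_input stands in for a gradient obtained from any autodiff framework:

    import numpy as np

    def fgsm(x, grad_wrt_input, epsilon=0.01):
        # x' = x + epsilon * sign(dL/dx): a small step in the direction
        # that most increases the loss, per the paper's linearity argument
        x_adv = x + epsilon * np.sign(grad_wrt_input)
        return np.clip(x_adv, 0.0, 1.0)  # keep pixels in the valid range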
Large Language Models
Core LLM techniques, including the GPT series, BERT, RLHF, chain-of-thought prompting, and more
7 papers
Improving Language Understanding by Generative Pre-Training (GPT-1)
2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
2018
Language Models are Few-Shot Learners (GPT-3)
2020
Training Language Models to Follow Instructions with Human Feedback (RLHF)
2022
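The middle stage of the paper's pipeline fits a reward model to human preference pairs with a pairwise ranking loss, -log sigmoid(r_chosen - r_rejected). A sketch, assuming the two arguments are reward-model scores for the preferred and dispreferred responses:

    import numpy as np

    def preference_loss(r_chosen, r_rejected):
        # -log(sigmoid(r_chosen - r_rejected)), written as log1p(exp(-x))
        # for numerical stability; drives the reward model to rank the
        # human-preferred response higher
        return np.log1p(np.exp(-(r_chosen - r_rejected)))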
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
2022
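Chain-of-thought prompting simply adds worked examples whose answers spell out intermediate steps; the demonstration below follows the paper's own tennis-ball arithmetic example:

    # One worked demonstration, then the actual question; the model is
    # expected to imitate the step-by-step answer format
    prompt = (
        "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
        "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
        "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is "
        "6 tennis balls. 5 + 6 = 11. The answer is 11.\n\n"
        "Q: The cafeteria had 23 apples. If they used 20 to make lunch and "
        "bought 6 more, how many apples do they have?\n"
        "A:"
    )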
Training Compute-Optimal Large Language Models (Chinchilla)
2022
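The paper's headline finding is that parameters and training tokens should grow together, at roughly 20 tokens per parameter. A back-of-the-envelope sketch using the common C ≈ 6ND FLOPs approximation (both constants are rules of thumb, not exact fits):

    def chinchilla_optimal(compute_flops, tokens_per_param=20.0):
        # C ~ 6 * N * D and D ~ 20 * N  =>  N = sqrt(C / 120)
        n_params = (compute_flops / (6.0 * tokens_per_param)) ** 0.5
        n_tokens = tokens_per_param * n_params
        return n_params, n_tokens

    # Chinchilla's own budget of ~5.76e23 FLOPs recovers roughly
    # 70B parameters and 1.4T tokens
    n, d = chinchilla_optimal(5.76e23)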
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity (MoE architecture)
2021
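A Switch layer routes each token to the single expert with the highest router probability (top-1 routing). A minimal NumPy sketch; real experts are feed-forward networks, reduced here to plain weight matrices, and the paper's load-balancing loss is omitted:

    import numpy as np

    def switch_layer(x, router_w, experts):
        # x: (tokens, d); router_w: (d, n_experts); experts: list of (d, d)
        logits = x @ router_w
        probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
        probs /= probs.sum(axis=-1, keepdims=True)
        top1 = probs.argmax(axis=-1)  # each token picks exactly one expert
        out = np.empty_like(x)
        for i, e in enumerate(top1):
            # Scaling by the router probability keeps routing differentiable
            out[i] = probs[i, e] * (x[i] @ experts[e])
        return out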
LLM Optimization and Deployment
Production engineering, covering LoRA, model compression, quantized deployment, RAG, and more
4 papers
LoRA: Low-Rank Adaptation of Large Language Models
2021
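LoRA freezes the pretrained weight W and learns a low-rank update, so the effective weight becomes W + (alpha/r) * ΔW with ΔW factored through rank r. A minimal NumPy sketch; zero-initializing one factor makes the update start at zero:

    import numpy as np

    d, r, alpha = 768, 4, 16
    rng = np.random.default_rng(0)
    W = rng.normal(size=(d, d))         # frozen pretrained weight
    A = rng.normal(size=(d, r)) * 0.01  # trainable, small random init
    B = np.zeros((r, d))                # trainable, zero init => update starts at 0

    def lora_forward(x, W, A, B, alpha=16, r=4):
        # Effective weight is W + (alpha / r) * (A @ B); only A, B are trained
        return x @ W + (alpha / r) * (x @ A) @ B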
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
2019
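DistilBERT's student is trained to match the teacher's temperature-softened output distribution (alongside the usual masked-LM loss). A sketch of just the distillation term; the customary T-squared gradient-scaling factor is omitted:

    import numpy as np

    def softmax(z, T=1.0):
        e = np.exp(z / T - (z / T).max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def distillation_loss(student_logits, teacher_logits, T=2.0):
        # Cross-entropy against the teacher's softened distribution; T > 1
        # exposes the teacher's relative confidence over non-target classes
        p_t = softmax(teacher_logits, T)
        p_s = softmax(student_logits, T)
        return -(p_t * np.log(p_s + 1e-12)).sum(axis=-1).mean()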
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
2022
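The paper builds on vector-wise absmax scaling to int8, plus a float16 path for outlier feature dimensions. A sketch of the absmax quantize/dequantize step only; the outlier decomposition is left out:

    import numpy as np

    def absmax_quantize(x):
        # Per-row scale maps float values onto the int8 range [-127, 127]
        scale = 127.0 / np.abs(x).max(axis=-1, keepdims=True)
        return np.round(x * scale).astype(np.int8), scale

    def dequantize(q, scale):
        return q.astype(np.float32) / scale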
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (RAG)
2020
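At inference time RAG embeds the query, retrieves the most similar passages, and conditions generation on them. A minimal sketch of that loop; embed and generate are hypothetical stand-ins for a query encoder and a seq2seq generator, and the prompt format is made up:

    import numpy as np

    def retrieve(query_vec, passage_vecs, passages, k=3):
        # Rank indexed passages by cosine similarity to the query embedding
        sims = passage_vecs @ query_vec / (
            np.linalg.norm(passage_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-9)
        return [passages[i] for i in np.argsort(-sims)[:k]]

    def rag_answer(question, passage_vecs, passages, embed, generate):
        docs = retrieve(embed(question), passage_vecs, passages)
        # Condition the generator on the retrieved evidence
        return generate("context: " + "\n".join(docs) + "\nquestion: " + question)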
AI Agents
Core techniques and frameworks, including foundational theory, LLM-driven agents, multi-agent collaboration, and more
11 papers
Multimodal and Cross-Domain Breakthroughs
Core papers including the AlphaFold series and Mamba
3 papers
Highly accurate protein structure prediction with AlphaFold (AlphaFold 2)
2021
Accurate Structure Prediction of Biomolecular Interactions with AlphaFold 3
2024
Mamba: Linear-Time Sequence Modeling with Selective State Spaces
2024
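Mamba's core recurrence is a state-space model whose B, C, and step size delta are functions of the input, which is what makes the state selective. A sequential single-channel NumPy sketch, using zero-order hold for A and a simplified Euler step for B; the paper computes this with a hardware-aware parallel scan rather than a Python loop:

    import numpy as np

    def selective_ssm(x, A, B_t, C_t, delta_t):
        # x: (L,) input channel; A: (N,) diagonal state matrix (negative values)
        # B_t, C_t: (L, N) input-dependent projections; delta_t: (L,) step sizes
        L, N = x.shape[0], A.shape[0]
        h, y = np.zeros(N), np.empty(L)
        for t in range(L):
            A_bar = np.exp(delta_t[t] * A)   # zero-order-hold discretization
            B_bar = delta_t[t] * B_t[t]      # simplified Euler step for B
            h = A_bar * h + B_bar * x[t]     # selective state update
            y[t] = C_t[t] @ h                # readout
        return y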