Core Papers

Core papers in AI, covering deep learning foundations, large language models, agents, and more

29 papers

Deep Learning Foundations

Foundational techniques, including core papers such as AlexNet, ResNet, and Transformer

4 papers

ImageNet Classification with Deep Convolutional Neural Networks (AlexNet)

2012

Deep Residual Learning for Image Recognition (ResNet)

2016

Attention Is All You Need (Transformer)

2017

Explaining and Harnessing Adversarial Examples

2015

Large Language Models

Core LLM techniques, including the GPT series, BERT, RLHF, chain-of-thought prompting, and more

7 papers

Improving Language Understanding by Generative Pre-Training (GPT-1)

2018

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

2018

Language Models are Few-Shot Learners (GPT-3)

2020

Training Language Models to Follow Instructions with Human Feedback (RLHF)

2022

Chain-of-Thought Prompting Elicits Reasoning in Large Language Models

2022

Training Compute-Optimal Large Language Models (Chinchilla)

2022

Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity (MoE architecture)

2021

Large Model Optimization and Deployment

Engineering for production use, including LoRA, model compression, quantized deployment, RAG, and more

4 papers

LoRA: Low-Rank Adaptation of Large Language Models

2021

DistilBERT, a Distilled Version of BERT: Smaller, Faster, Cheaper and Lighter

2019

LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale

2022

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (RAG)

2020

AI Agents

Core techniques and frameworks, including foundational theory, LLM-driven agents, multi-agent collaboration, and more

11 papers

LLM-Driven Agents

The Rise and Potential of Large Language Model Based Agents

2023 (LLM-Driven Agents)

Advances and Challenges in Foundation Agents

2025 (LLM-Driven Agents)

General Framework of AI Agents

2026 (LLM-Driven Agents)

Multimodal and Cross-Domain Breakthroughs

Core papers including the AlphaFold series and Mamba

3 papers

Highly Accurate Protein Structure Prediction with AlphaFold (AlphaFold 2)

2021

Accurate Structure Prediction of Biomolecular Interactions with AlphaFold 3

2024

Mamba: Linear-Time Sequence Modeling with Selective State Spaces

2024