Selected References
🧒 Here are the references
39 key papers and books that have shaped the development of AI. If you want to go deeper, look these up and read them yourself. The especially big ones:
- Vaswani 2017: the original Transformer paper ("Attention Is All You Need")
- Brown 2020: GPT-3
- Ho 2020: diffusion models
- Ouyang 2022: InstructGPT (the foundation of ChatGPT)
- Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural Machine Translation by Jointly Learning to Align and Translate. ICLR.
- Bai, Y., et al. (2022). Constitutional AI: Harmlessness from AI Feedback. arXiv:2212.08073.
- Belkin, M., et al. (2019). Reconciling modern machine-learning practice and the classical bias-variance trade-off. PNAS.
- Bengio, Y., Courville, A., & Vincent, P. (2013). Representation Learning: A Review and New Perspectives. TPAMI.
- Bommasani, R., et al. (2021). On the Opportunities and Risks of Foundation Models. arXiv:2108.07258.
- Brown, T., et al. (2020). Language Models are Few-Shot Learners. NeurIPS.
- Cybenko, G. (1989). Approximation by superpositions of a sigmoidal function. MCSS.
- Dao, T., et al. (2022). FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. NeurIPS.
- Devlin, J., et al. (2019). BERT: Pre-training of Deep Bidirectional Transformers. NAACL.
- Goodfellow, I., et al. (2014). Generative Adversarial Nets. NeurIPS.
- He, K., et al. (2016). Deep Residual Learning for Image Recognition. CVPR.
- Ho, J., Jain, A., & Abbeel, P. (2020). Denoising Diffusion Probabilistic Models. NeurIPS.
- Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation.
- Hoffmann, J., et al. (2022). Training Compute-Optimal Large Language Models (Chinchilla). arXiv:2203.15556.
- Kaplan, J., et al. (2020). Scaling Laws for Neural Language Models. arXiv:2001.08361.
- Kingma, D., & Ba, J. (2015). Adam: A Method for Stochastic Optimization. ICLR.
- Kingma, D., & Welling, M. (2014). Auto-Encoding Variational Bayes. ICLR.
- Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). ImageNet Classification with Deep CNNs. NeurIPS.
- LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep Learning. Nature.
- Lewis, P., et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP. NeurIPS.
- Lipman, Y., et al. (2023). Flow Matching for Generative Modeling. ICLR.
- McCulloch, W., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys.
- Mnih, V., et al. (2015). Human-level control through deep reinforcement learning. Nature.
- Ouyang, L., et al. (2022). Training language models to follow instructions with human feedback. NeurIPS.
- Radford, A., et al. (2021). Learning Transferable Visual Models From Natural Language Supervision (CLIP). ICML.
- Rafailov, R., et al. (2023). Direct Preference Optimization. NeurIPS.
- Rombach, R., et al. (2022). High-Resolution Image Synthesis with Latent Diffusion Models. CVPR.
- Rumelhart, D., Hinton, G., & Williams, R. (1986). Learning representations by back-propagating errors. Nature.
- Russell, S., & Norvig, P. (2020). Artificial Intelligence: A Modern Approach (4th ed.). Pearson.
- Schulman, J., et al. (2017). Proximal Policy Optimization Algorithms. arXiv:1707.06347.
- Silver, D., et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature.
- Song, Y., et al. (2021). Score-Based Generative Modeling through Stochastic Differential Equations. ICLR.
- Srivastava, N., et al. (2014). Dropout: A Simple Way to Prevent Neural Networks from Overfitting. JMLR.
- Su, J., et al. (2021). RoFormer: Enhanced Transformer with Rotary Position Embedding. arXiv:2104.09864.
- Turing, A. (1950). Computing Machinery and Intelligence. Mind.
- Vapnik, V. (1998). Statistical Learning Theory. Wiley.
- Vaswani, A., et al. (2017). Attention Is All You Need. NeurIPS.
- Wei, J., et al. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. NeurIPS.
- Wei, J., et al. (2022). Emergent Abilities of Large Language Models. TMLR.