Selected References
🧒 Here are the references
39 key papers and books that have shaped the development of AI. If you want to go deeper, look these up and read them yourself. The especially big ones:
- Vaswani 2017: the original Transformer paper ("Attention Is All You Need")
- Brown 2020: GPT-3
- Ho 2020: diffusion models
- Ouyang 2022: InstructGPT (the foundation of ChatGPT)
- Bahdanau, D., Cho, K., & Bengio, Y. (2015). Neural Machine Translation by Jointly Learning to Align and Translate. ICLR.
- Bai, Y., et al. (2022). Constitutional AI: Harmlessness from AI Feedback. arXiv:2212.08073.
- Belkin, M., et al. (2019). Reconciling modern machine-learning practice and the classical bias-variance trade-off. PNAS.
- Bengio, Y., Courville, A., & Vincent, P. (2013). Representation Learning: A Review and New Perspectives. TPAMI.
- Bommasani, R., et al. (2021). On the Opportunities and Risks of Foundation Models. arXiv:2108.07258.
- Brown, T., et al. (2020). Language Models are Few-Shot Learners. NeurIPS.
- Cybenko, G. (1989). Approximation by superpositions of a sigmoidal function. MCSS.
- Dao, T., et al. (2022). FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness. NeurIPS.
- Devlin, J., et al. (2019). BERT: Pre-training of Deep Bidirectional Transformers. NAACL.
- Goodfellow, I., et al. (2014). Generative Adversarial Nets. NeurIPS.
- He, K., et al. (2016). Deep Residual Learning for Image Recognition. CVPR.
- Ho, J., Jain, A., & Abbeel, P. (2020). Denoising Diffusion Probabilistic Models. NeurIPS.
- Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation.
- Hoffmann, J., et al. (2022). Training Compute-Optimal Large Language Models (Chinchilla). arXiv:2203.15556.
- Kaplan, J., et al. (2020). Scaling Laws for Neural Language Models. arXiv:2001.08361.
- Kingma, D., & Ba, J. (2015). Adam: A Method for Stochastic Optimization. ICLR.
- Kingma, D., & Welling, M. (2014). Auto-Encoding Variational Bayes. ICLR.
- Krizhevsky, A., Sutskever, I., & Hinton, G. (2012). ImageNet Classification with Deep CNNs. NeurIPS.
- LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep Learning. Nature.
- Lewis, P., et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP. NeurIPS.
- Lipman, Y., et al. (2023). Flow Matching for Generative Modeling. ICLR.
- McCulloch, W., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. Bull. Math. Biophys.
- Mnih, V., et al. (2015). Human-level control through deep reinforcement learning. Nature.
- Ouyang, L., et al. (2022). Training language models to follow instructions with human feedback. NeurIPS.
- Radford, A., et al. (2021). Learning Transferable Visual Models From Natural Language Supervision (CLIP). ICML.
- Rafailov, R., et al. (2023). Direct Preference Optimization. NeurIPS.
- Rombach, R., et al. (2022). High-Resolution Image Synthesis with Latent Diffusion Models. CVPR.
- Rumelhart, D., Hinton, G., & Williams, R. (1986). Learning representations by back-propagating errors. Nature.
- Russell, S., & Norvig, P. (2020). Artificial Intelligence: A Modern Approach (4th ed.). Pearson.
- Schulman, J., et al. (2017). Proximal Policy Optimization Algorithms. arXiv:1707.06347.
- Silver, D., et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature.
- Song, Y., et al. (2021). Score-Based Generative Modeling through Stochastic Differential Equations. ICLR.
- Srivastava, N., et al. (2014). Dropout: A Simple Way to Prevent Neural Networks from Overfitting. JMLR.
- Su, J., et al. (2021). RoFormer: Enhanced Transformer with Rotary Position Embedding. arXiv:2104.09864.
- Turing, A. (1950). Computing Machinery and Intelligence. Mind.
- Vapnik, V. (1998). Statistical Learning Theory. Wiley.
- Vaswani, A., et al. (2017). Attention Is All You Need. NeurIPS.
- Wei, J., et al. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. NeurIPS.
- Wei, J., et al. (2022). Emergent Abilities of Large Language Models. TMLR.