[YouTube Lecture Notes] Andrej Karpathy - Deep Dive into LLMs like ChatGPT

Introduction

Pre-Training

Step 1: Download and preprocess the internet

Step 2: Tokenization

Step 3: Neural network training

Step 4: Inference

Base model

Post-Training: Supervised Finetuning

Conversations

Hallucinations

Knowledge of Self

Models need tokens to think

Things the model cannot do well

Post-Training: Reinforcement Learning

Reinforcement learning

DeepSeek-R1

AlphaGo

Reinforcement learning from human feedback (RLHF)

Preview of things to come

Keeping track of LLMs

Where to find LLMs

Pre-Training

  • Step 1: Download and preprocess the internet

  • Step 2: Tokenization

  • Step 3: Neural network training

  • Step 4: Inference