Completed a CMU Advanced NLP assignment in which I built a minimalist version of Llama2 in PyTorch, starting from the pretrained checkpoint stories42M.pt (an 8-layer, 42M-parameter language model pretrained on the [TinyStories] dataset). Implementing it from scratch taught me the essential components of an LLM, including rotary position embeddings (RoPE), scaled dot-product attention, and the AdamW optimizer, as well as how to adapt a pretrained backbone for sentence classification.
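For a flavor of the core pieces, here is a minimal PyTorch sketch of RoPE feeding into scaled dot-product attention. This is not the assignment's actual code; the function names, the `(batch, n_heads, seq_len, head_dim)` layout, and the default `theta = 10000.0` base are illustrative assumptions.

```python
import math
import torch

def apply_rope(x: torch.Tensor, theta: float = 10000.0) -> torch.Tensor:
    """Rotary position embedding: rotate each (even, odd) channel pair by a
    position-dependent angle. x: (batch, n_heads, seq_len, head_dim)."""
    _, _, seqlen, head_dim = x.shape
    # Per-pair frequencies theta^(-2i/d), as in the RoPE paper.
    freqs = theta ** (-torch.arange(0, head_dim, 2, dtype=torch.float32) / head_dim)
    angles = torch.outer(torch.arange(seqlen, dtype=torch.float32), freqs)
    cos = angles.cos()[None, None, :, :]  # broadcast over batch and heads
    sin = angles.sin()[None, None, :, :]
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin  # 2-D rotation of each channel pair
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

def sdpa(q, k, v, mask=None):
    """softmax(Q K^T / sqrt(d_k)) V, with an optional boolean mask.
    q, k, v: (batch, n_heads, seq_len, head_dim)."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return scores.softmax(dim=-1) @ v

# Usage: RoPE is applied to queries and keys only, never to values.
q, k, v = (torch.randn(1, 8, 16, 64) for _ in range(3))
out = sdpa(apply_rope(q), apply_rope(k), v)  # (1, 8, 16, 64)
```

The third component, AdamW, decouples weight decay from the adaptive gradient update; PyTorch ships it as `torch.optim.AdamW`.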