DeepSeek's Janus Models
Multimodal understanding and generation from the same models
ModernBERT
A Much-Needed Update to an ML Workhorse
Gemma Scope
Open Sparse Autoencoders Everywhere All At Once on Gemma 2
Large Concept Models
LLMs thinking in concepts, not just word by word
C4 Diagrams: A Guide to Visualizing Software Architecture
Making engineering diagrams speak for themselves