This explains why Falcon 40B outperforms LLaMA 65B on several benchmarks despite having fewer parameters: cleaner data, not more compute.
