Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum