In section 4.3.2 of this paper describing the BabyJubJub curve, an algorithm for computing the scalar multiple $[k]P$ is given, where $k$ is in arbitrarily large integer and $P$ is a point whose order is $251$ to $254$ bits on the BabyJubJub curve. In this algorithm, $k$ is 'split' into chunks of $248$ bits and the algorithm computes the scalar mulitple $[k]P$ using these chunks.
My question is the following - why is such an algorithm relevant to computing scalar multiples of points of around $251$-bit order? We know that the scalar multiple $[k]P$ is equivalent to $[k \text{ % order(}P)]P$ (here $\%$ is the modulo operator) and so any scalar multiple of such a point can be reduced to the scalar multiple $[k']P$ where $k'=k\text{ % order(}P)$ is around $251$ bits. How is an algorithm that splits $k$ up into chunks of $248$ bits and then computes the scalar multiple any better than simply computing $[k\text{ % order(}P)]P$ using the usual double-and-add algorithm?
Perhaps this algorithm is taken from somwhere else and is written here for the sake of having an algorithm present for computing scalar multiples on this curve in a way that uses arithmetic circuits.