[MIPS] Fix csum_partial_copy_from_user
I found that the asm version of csum_partial_copy_from_user() introduced
in e9e016815f264227b6260f77ca84f1c43cf8b9bd is inefficient.
In the csum_partial_copy_from_user() case, the "both_aligned" 8-word
copy/sum loop is skipped so that a LOAD failure can be handled properly,
and the 4-word copy/sum block is not a loop, so we end up looping in the
inefficient "less_than_4units" block.
This patch rearranges register usage so that t0-t7 can be used in the
"both_aligned" loop. This allows the "both_aligned" loop to be used for
the copy_from_user case too. This patch also cleans up the code around
the entry point.
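For illustration only, the loop structure described above can be
sketched in portable C. This is not the MIPS assembly from the patch:
the names csum_copy_words() and fold_sum() are hypothetical, the locals
t0..t7 merely mirror the register names, and user-space fault handling
is not modelled. The point is that the bulk of the data goes through
the 8-word loop and only the remainder reaches the per-word tail loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fold the accumulator to 16 bits with end-around carry. */
static uint16_t fold_sum(uint64_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);
    return (uint16_t)sum;
}

/*
 * Hypothetical sketch: copy nwords 32-bit words from src to dst and
 * return the folded ones'-complement sum of the copied words.
 */
static uint16_t csum_copy_words(uint32_t *dst, const uint32_t *src, size_t nwords)
{
    uint64_t sum = 0;

    /* Fast path: eight words per iteration (the "both_aligned" idea). */
    while (nwords >= 8) {
        uint32_t t0 = src[0], t1 = src[1], t2 = src[2], t3 = src[3];
        uint32_t t4 = src[4], t5 = src[5], t6 = src[6], t7 = src[7];

        dst[0] = t0; dst[1] = t1; dst[2] = t2; dst[3] = t3;
        dst[4] = t4; dst[5] = t5; dst[6] = t6; dst[7] = t7;

        sum += (uint64_t)t0 + t1 + t2 + t3 + t4 + t5 + t6 + t7;

        src += 8;
        dst += 8;
        nwords -= 8;
    }

    /* Slow tail: one word per iteration, used only for the remainder. */
    while (nwords > 0) {
        uint32_t t = *src++;

        *dst++ = t;
        sum += t;
        nwords--;
    }

    return fold_sum(sum);
}
```

If the fast path is skipped, every word goes through the tail loop,
which is the inefficiency this patch removes for the copy_from_user
case.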
Signed-off-by: Atsushi Nemoto <anemo@mba.ocn.ne.jp>
Signed-off-by: Ralf Baechle <ralf@linux-mips.org>