net: optimize checksum computation

This is a very simple loop optimization with a significant performance
impact.

Microbenchmark results, modern x86-64:

buffer size | speedup
------------+--------
       1500 | 1.7x
         64 | 1.5x
          8 | 1.15x

Microbenchmark results, POWER7:

buffer size | speedup
------------+--------
       1500 | 5x
         64 | 3.3x
          8 | 1.13x

There is a lot of room for further improvement at the expense of
code complexity: aligned multibyte reads, LE/BE considerations,
architecture-specific optimizations, etc. This patch deliberately
stops short of those and keeps things simple and readable.
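
For illustration only, here is a minimal sketch of the kind of loop
restructuring that can give this sort of speedup. It is not necessarily
the exact code in this patch: the function names csum_add_branchy and
csum_add_split and their signatures are hypothetical. The sketch assumes
a 16-bit one's-complement style sum where the byte at an even stream
offset is the high byte, and buffers small enough (64 KiB or less) that
the 32-bit accumulators cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical baseline: decide per byte whether it is the high or the
 * low half of a 16-bit word. 'seq' is the offset of buf[0] within the
 * checksummed stream, so parity of (seq + i) selects the half.
 */
static uint32_t csum_add_branchy(const uint8_t *buf, size_t len, size_t seq)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        if ((seq + i) & 1) {
            sum += buf[i];                  /* odd offset: low byte */
        } else {
            sum += (uint32_t)buf[i] << 8;   /* even offset: high byte */
        }
    }
    return sum;
}

/*
 * Hypothetical optimized variant: accumulate bytes at even and odd
 * buffer indices separately, with no branch or shift inside the loop,
 * and apply the shift once at the end based on the parity of 'seq'.
 */
static uint32_t csum_add_split(const uint8_t *buf, size_t len, size_t seq)
{
    uint32_t even = 0, odd = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        even += buf[i];
        odd += buf[i + 1];
    }
    if (i < len) {
        even += buf[i];                     /* trailing unpaired byte */
    }

    if (seq & 1) {
        /* buf[0] sits at an odd stream offset, so 'even' holds low bytes */
        return even + (odd << 8);
    }
    return odd + (even << 8);
}
```

Both functions return the same unfolded 32-bit partial sum; folding the
carries into 16 bits is left to the caller, as before. The second form
still reads one byte at a time, so it stays clear of the alignment and
endianness questions mentioned above.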

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>