gpu: only add synchronization after copies from shared to global that require it