Recheck: send batches, process results
Summary:
Once the workers are initialized for recheck (they have loaded the base naming table and the naming table diff, and have updated the repo to reflect its state on the user's host), they can get batches of work to process.
The lifecycle of a batch is:
* take some N jobs from the list of files-to-process:
  * the size of the batch is controlled by the max_batch_size and min_batch_size config values; the current heuristic is very approximate and needs further testing and measurement, and the values I chose for these two settings are somewhat arbitrary
  * the files eligible for remote type checking are those that need to be type checked (as opposed to declared) and that have not already been checked locally (their deferred count must be 0)
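The batch selection described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual implementation: the `Job` type, the `kind` labels, and `take_remote_batch` are hypothetical names, and the way min_batch_size and max_batch_size are applied here (skip the batch when too few files are eligible, cap it at the maximum) is an assumption, since the commit only says the heuristic is approximate.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Job:
    """Hypothetical stand-in for one entry in the files-to-process list."""
    path: str
    kind: str                # "check" or "declare" (illustrative labels)
    deferred_count: int = 0  # > 0 means the file was already handled locally


def is_remote_eligible(job: Job) -> bool:
    # Eligible: needs type checking (not declaring) and was not deferred locally.
    return job.kind == "check" and job.deferred_count == 0


def take_remote_batch(
    pending: List[Job], min_batch_size: int, max_batch_size: int
) -> Tuple[List[Job], List[Job]]:
    """Return (batch, rest). Assumed policy: no batch if fewer than
    min_batch_size files are eligible; otherwise take up to max_batch_size."""
    eligible_count = sum(1 for job in pending if is_remote_eligible(job))
    if eligible_count < min_batch_size:
        return [], list(pending)

    batch: List[Job] = []
    rest: List[Job] = []
    for job in pending:
        if is_remote_eligible(job) and len(batch) < max_batch_size:
            batch.append(job)
        else:
            rest.append(job)  # ineligible, or the batch is already full
    return batch, rest
```

Files that are not taken stay in the pending list in their original order, so they can be picked up by a later batch or processed locally.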
When the remote worker responds, the results are processed by creating a MultiWorker job that consists of:
* importing the dependency graph that the remote worker produced as a result of rechecking the batch
* returning the results of the recheck (the recheck progress and the errors found, if any) for the `merge` MultiWorker function to consume:
  * if the batch failed on the remote worker (or if we fail to import the dependency graph), then the recheck progress indicates that the entire batch "remains" to be checked (the files are returned to the pool of files to be processed)
  * if the batch succeeded, then all the files are counted as "completed"
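The result handling above could look roughly like the sketch below. It is written in Python for illustration only: `RemoteResult`, `Progress`, `process_remote_result`, and the boolean-returning `import_dep_graph` callback are all hypothetical names, and dropping the errors of a failed batch is an assumption (the files are rechecked later, so their errors would be reported then).

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

DepEdge = Tuple[str, str]  # hypothetical (dependent, dependency) pair


@dataclass
class RemoteResult:
    """Hypothetical shape of what a remote worker sends back for one batch."""
    ok: bool
    dep_graph: List[DepEdge] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)


@dataclass
class Progress:
    completed: List[str] = field(default_factory=list)
    remaining: List[str] = field(default_factory=list)


def process_remote_result(
    batch: List[str],
    result: RemoteResult,
    import_dep_graph: Callable[[List[DepEdge]], bool],
) -> Tuple[Progress, List[str]]:
    """Import the remote dependency graph, then report progress and errors.

    import_dep_graph is assumed to return False when the import fails.
    """
    if result.ok and import_dep_graph(result.dep_graph):
        # Success: every file in the batch counts as completed.
        return Progress(completed=list(batch)), list(result.errors)
    # Remote failure or failed import: the whole batch goes back to the pool.
    return Progress(remaining=list(batch)), []


def merge(
    acc: Tuple[Progress, List[str]], item: Tuple[Progress, List[str]]
) -> Tuple[Progress, List[str]]:
    """Fold one job's output into the accumulated progress and errors."""
    (acc_progress, acc_errors), (progress, errors) = acc, item
    merged = Progress(
        completed=acc_progress.completed + progress.completed,
        remaining=acc_progress.remaining + progress.remaining,
    )
    return merged, acc_errors + errors
```

Treating a failed import the same as a remote failure keeps the accounting simple: a file is either fully completed, with its dependency edges imported, or still waiting to be checked.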
Reviewed By: ljw1004
Differential Revision: D19109185
fbshipit-source-id: d3c68aa075bb3f4ef9f52b251dec8ffe012669b0