Carlos Tasada [Mon, 22 Apr 2013 13:47:15 +0000 (22 15:47 +0200)]
Fixed path separator in classpath variable
Carlos Tasada [Thu, 18 Apr 2013 07:55:20 +0000 (18 09:55 +0200)]
Added voldemort-convert-bdb batch file to upgrade from 1.1.x pre-versions
Minor fixes in other batch files
Chinmay Soman [Mon, 15 Apr 2013 20:41:59 +0000 (15 13:41 -0700)]
Added Null pointer check in teardown of ReadOnlyStorageEngineTest
Chinmay Soman [Mon, 15 Apr 2013 18:33:41 +0000 (15 11:33 -0700)]
Added bigger timeout to testNoSlopsOnAllReplicaFailures test and doing better exception handling in HintedHandoffFailureTest
Chinmay Soman [Fri, 12 Apr 2013 01:20:57 +0000 (11 18:20 -0700)]
Adding another Hinted handoff failure test to ensure main thread returns with failure when all replicas don't respond
Vinoth Chandar [Fri, 12 Apr 2013 23:49:39 +0000 (12 16:49 -0700)]
Forklift tool fix to equally spread fetches
Vinoth Chandar [Thu, 11 Apr 2013 19:40:30 +0000 (11 12:40 -0700)]
Clarifying arbitrary choice to return BEFORE for equal vector clocks.
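The equal-clock case mentioned above can be illustrated with a minimal sketch. It assumes a simplified clock modeled as a map from node ID to counter, with missing entries treated as 0. The class ClockCompareSketch, the enum Occurred, and the method compare are hypothetical names for illustration and are not Voldemort's actual VectorClock code.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical, simplified stand-in for a vector clock comparison (not Voldemort's code).
final class ClockCompareSketch {

    enum Occurred { BEFORE, AFTER, CONCURRENTLY }

    // Compares clock a to clock b; each clock maps node ID -> counter.
    // Assumption: a node missing from a clock counts as 0.
    static Occurred compare(Map<Integer, Long> a, Map<Integer, Long> b) {
        boolean aBigger = false;
        boolean bBigger = false;
        Set<Integer> nodes = new HashSet<>(a.keySet());
        nodes.addAll(b.keySet());
        for (Integer node : nodes) {
            long av = a.getOrDefault(node, 0L);
            long bv = b.getOrDefault(node, 0L);
            if (av > bv) aBigger = true;
            if (bv > av) bBigger = true;
        }
        if (aBigger && bBigger) return Occurred.CONCURRENTLY;
        if (aBigger) return Occurred.AFTER;
        // Reached both when a is strictly before b and when the clocks are equal.
        // Equality has no natural ordering, so BEFORE is an arbitrary but fixed choice.
        return Occurred.BEFORE;
    }
}
```

Callers that need to tell "equal" apart from "strictly before" would have to check equality separately in this sketch.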
Vinoth Chandar [Thu, 11 Apr 2013 01:41:30 +0000 (10 18:41 -0700)]
Adding unit test for fork lift tool
Vinoth Chandar [Wed, 10 Apr 2013 01:05:00 +0000 (9 18:05 -0700)]
Tool to forklift data over for store migrations
Chinmay Soman [Fri, 5 Apr 2013 01:52:58 +0000 (4 18:52 -0700)]
Bug fixes to HintedHandoffFailureTest and added more tests to handle 3-2-2 config. Removed SleepyForceFailStore
Chinmay Soman [Thu, 4 Apr 2013 00:19:02 +0000 (3 17:19 -0700)]
Added unit tests to ensure that slops are registered for different asynchronous put operation failures
Zhongjie Wu [Wed, 3 Apr 2013 22:35:57 +0000 (3 15:35 -0700)]
Make run-class.sh work on Mac, per ctasada
https://github.com/voldemort/voldemort/issues/132
Chinmay Soman [Sat, 23 Mar 2013 00:38:58 +0000 (22 17:38 -0700)]
- Added GetAll and Delete implementations on the Coordinator (and the temporary rest client)
- Converted Coordinator into an AbstractService and added CoordinatorConfig
- Refactored Composite Voldemort request into different types
Chinmay Soman [Fri, 8 Mar 2013 23:59:12 +0000 (8 15:59 -0800)]
Moving thin client to contrib. Also fixing Benchmark to use DefaultStoreClient instead of Thin client
Chinmay Soman [Fri, 8 Mar 2013 22:27:17 +0000 (8 14:27 -0800)]
Adding the missing R2Store file
Chinmay Soman [Fri, 8 Mar 2013 22:23:49 +0000 (8 14:23 -0800)]
A working implementation of the Coordinator and thin client. Includes following things:
- Creating AbstractStore and AbstractStorageEngine to refactor
the corresponding Store and StorageEngine interfaces.
- Refactored the fat client to accommodate dynamic per call timeout.
- Isolated Fat client wrapper to safeguard multitenancy
- Autobootstrap mechanism added to the Coordinator service
- Basic HTTP request/response parsing and Error handling
Chinmay Soman [Sat, 2 Mar 2013 23:49:42 +0000 (2 15:49 -0800)]
Fixed .classpath which had an illegal entry
Chinmay Soman [Sat, 2 Mar 2013 23:31:41 +0000 (2 15:31 -0800)]
First working version of Coordinator service. Includes REST request and response handling, Error handling, automatic checking of metadata changes, fat client config management
Chinmay Soman [Fri, 15 Feb 2013 22:35:16 +0000 (15 14:35 -0800)]
Creating FatClientWrapper for Thread pool isolation.
Chinmay Soman [Thu, 1 Nov 2012 21:42:27 +0000 (1 14:42 -0700)]
Added the coordinator package. Modified Benchmark to use thin client
Chinmay Soman [Mon, 29 Oct 2012 23:22:56 +0000 (29 16:22 -0700)]
Basic working prototype for Coordinator and Sample Thin client
Abhinay Nagpal [Mon, 25 Mar 2013 20:19:33 +0000 (25 13:19 -0700)]
fixed run-class.sh to take absolute paths
Abhinay Nagpal [Mon, 25 Mar 2013 18:05:00 +0000 (25 11:05 -0700)]
Updated release number
Abhinay Nagpal [Mon, 25 Mar 2013 18:03:34 +0000 (25 11:03 -0700)]
Updated release notes
Abhinay Nagpal [Mon, 25 Mar 2013 17:39:10 +0000 (25 10:39 -0700)]
last commit did not pick up the actual fix for the test case; checking it in
Abhinay Nagpal [Mon, 25 Mar 2013 17:26:01 +0000 (25 10:26 -0700)]
Minor changes to test case
Abhinay Nagpal [Fri, 22 Mar 2013 21:33:35 +0000 (22 14:33 -0700)]
Added test case to test runtime exceptions
Cleaned up the try catch logic
Abhinay Nagpal [Fri, 22 Mar 2013 00:54:56 +0000 (21 17:54 -0700)]
Adding test cases which simulate intermittent exceptions etc
in the hdfsfetcher and simulate retry logic
this ensures checksum calculation is robust
Chinmay Soman [Thu, 21 Mar 2013 21:40:56 +0000 (21 14:40 -0700)]
Adding an extra catch block for Exception and Throwable types. This is used to catch the ClassNotFound exceptions
Chinmay Soman [Thu, 21 Mar 2013 17:34:12 +0000 (21 10:34 -0700)]
Using a per file checksum generator in the file copy in HdfsFetcher. This is used to handle the case where we might retry the copy in case of a Filesystem (hdfs) error.
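The reason for a per-file checksum generator can be shown with a small sketch: if a copy is retried, bytes from the failed attempt must not leak into the checksum. The code below is an illustration under stated assumptions, not the HdfsFetcher implementation. RetryChecksumSketch and copyWithRetry are hypothetical names, an in-memory sink stands in for the destination file, and CRC32 stands in for whichever checksum type is configured.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.function.Supplier;
import java.util.zip.CRC32;

// Hypothetical sketch of "fresh checksum per copy attempt" (not HdfsFetcher code).
final class RetryChecksumSketch {

    // Copies the stream opened by 'source' into 'sink', retrying on IOException.
    // Returns the CRC32 of the bytes written by the attempt that succeeded.
    static long copyWithRetry(Supplier<InputStream> source,
                              ByteArrayOutputStream sink,
                              int maxAttempts) {
        IOException last = new IOException("no attempts made");
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            CRC32 checksum = new CRC32(); // fresh generator for every attempt
            sink.reset();                 // discard partial output of a failed attempt
            try (InputStream in = source.get()) {
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    sink.write(buf, 0, n);
                    checksum.update(buf, 0, n);
                }
                return checksum.getValue();
            } catch (IOException e) {
                last = e; // remember the failure and retry
            }
        }
        throw new UncheckedIOException(last);
    }
}
```

With a single shared generator, the partial bytes of a failed attempt would be folded into the final value and the checksum would no longer match the file.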
Jay J Wylie [Fri, 22 Mar 2013 21:39:15 +0000 (22 14:39 -0700)]
Reduce chatter in logs while a node is unavailable (from INFO to DEBUG level output).
Jay J Wylie [Fri, 22 Mar 2013 16:56:22 +0000 (22 09:56 -0700)]
Fix to log4j set up in bin/run-class.sh that should correctly configure log4j.
Jay J Wylie [Wed, 20 Mar 2013 22:34:50 +0000 (20 15:34 -0700)]
Added pending release notes for this merge to
Jay J Wylie [Wed, 20 Mar 2013 20:52:54 +0000 (20 13:52 -0700)]
TODOs for refactoring and copyright header updates.
Jay J Wylie [Wed, 20 Mar 2013 18:01:13 +0000 (20 11:01 -0700)]
Addressed all code review comments for KeySampler and KeyVersionFetcher. Renamed many classes and methods related to FetchStreamRequestHandler.
- All sub-classes of FetchStreamRequestHandler have been renamed to have a more consistent nomenclature.
- Did some further refactoring in the FullScan* classes to move more work from leaf classes to FullScanFetchRequestHandler.java
- moved scan accounting to overall base class
- Added getNodesPartitionIdForKey method to StoreInstance to help with some fetch logic
Jay J Wylie [Wed, 20 Mar 2013 17:30:36 +0000 (20 10:30 -0700)]
Interim commit to rename a bunch of files.
Jay J Wylie [Thu, 14 Mar 2013 21:55:10 +0000 (14 14:55 -0700)]
Addressed review feedback and TODOs for KeyVersionSamplerCLI (and renamed it to KeyVersionFetcherCLI).
- mostly usability changes about command line options...
- one copyright fix
Jay J Wylie [Thu, 14 Mar 2013 21:09:35 +0000 (14 14:09 -0700)]
Addressed all review feedback and TODOs for KeySamplerCLI
KeySamplerCLI
- added options: --store-names, --partition-ids, --keys-per-second-limit, and --progress-period-ops
- got rid of unnecessary (and weird) retry loop. Can add something like that later if needed.
- pass all partitions to fetcher now instead of one-at-a-time
Also did cosmetic fixes for KeyVersionSamplerCLI and Entropy.java
Jay J Wylie [Thu, 14 Mar 2013 17:36:14 +0000 (14 10:36 -0700)]
Correctness fixes and significant refactoring of Fetch*StreamRequestHandlers. Expanded AdminFetchTest.
Added more common helper methods to common base class of all fetchers FetchStreamRequestHandler.
Added abstract base classes for partition-based fetching and non-partition-based fetching:
- FetchPartitionStreamRequestHandler (partition-based)
- FetchItemsStreamRequestHandler (non-partition-based)
Refactored some code up to abstract base classes and made implementations as similar as possible (without heroic efforts) across all fetchers:
- FetchEntriesStreamRequestHandler
- FetchKeysStreamRequestHandler
- FetchPartitionEntriesStreamRequestHandler
- FetchPartitionKeysStreamRequestHandler
Significantly better test coverage in AdminFetchTest
- tests fetching keys as well as fetching entries
- tests partition-aware and non-partition-aware servers
- tests per-partition limits on entries/keys fetched
All of this clean up and additional testing led to minor correctness fixes.
Minor other clean ups of comments, override annotations, and fixes for KeySamplerCLI.
Jay J Wylie [Wed, 13 Mar 2013 17:34:21 +0000 (13 10:34 -0700)]
change maxRecords to recordsPerPartition in fetch API and protobuf
These are cosmetic changes. The client-side and server-side code does not properly do recordsPerPartition yet.
Added a few TODOs in the code too.
Jay J Wylie [Wed, 13 Mar 2013 16:53:11 +0000 (13 09:53 -0700)]
remove skipRecords from fetching API and protobuf
AFAIK skipRecords was never used. By inspection, the code that would have been exercised if it had been used has never been correct. Removing skipRecords from the code base.
Also:
- Added a number of TODOs to the code from the reviews
- Changed some variable names
Jay J Wylie [Tue, 12 Mar 2013 19:57:22 +0000 (12 12:57 -0700)]
Minor fixes for tests that broke due to changes elsewhere in the code.
Jay J Wylie [Mon, 11 Mar 2013 23:03:20 +0000 (11 16:03 -0700)]
Minor fix for change to AdminClient
src/java/voldemort/client/protocol/admin/AdminClient.java
- do not close down AdminStoreClient from queryKeys
test/unit/voldemort/client/AdminServiceBasicTest.java
- added some additional checks to test to confirm (non)existence of exceptions&values
Jay J Wylie [Mon, 11 Mar 2013 21:46:59 +0000 (11 14:46 -0700)]
Many minor tweaks to ConsistencyFix code and related files to address minor review feedback.
build.xml
- fixed commenting out of 'protobuff' target
src/java/voldemort/client/protocol/admin/AdminClient.java
- add ClientConfig to constructor. This is needed for AdminStoreClient creation. It is confusing that we need both an AdminClientConfig and ClientConfig, but that is because the *ClientConfig code is so clumsy.
- changed ".stop()" methods to ".close()" to be consistent with other interfaces.
et cetera
- Updated all copyright notices that have changed on this branch since December. This touched a ton of files...
- annotated some TODOs with "(refactor)" to make refactoring todos easier to find.
Jay J Wylie [Fri, 8 Mar 2013 00:45:28 +0000 (7 16:45 -0800)]
Added unit tests for ConsistencyFix, ConsistencyFixWorker, and QueryKeyResult.
Many other fixes and cleanup:
src/java/voldemort/utils/ConsistencyFix.java
- tweak many variable names
- add close method to stop adminClient
- broke out BadKey to wrap a key with its string representation so that failed fixes of bad keys can be dumped in full to file to be retried (without any additional effort)
- marked 'parseVersion' as deprecated since, if we do this again, we should dump bytes not strings
- track obsolete version exceptions and various statuses in Stats
src/java/voldemort/utils/ConsistencyFixCLI.java
- clean up of arguments, variable names, etc.
- cleanly close down fixer...
src/java/voldemort/utils/ConsistencyFixWorker.java
- more logger.trace output
- minor cleanup
test/common/voldemort/TestUtils.java
- added getVersioned() helper method
test/common/voldemort/config/stores.xml
- added consistency-fix store
test/unit/voldemort/store/routed/ReadRepairerTest.java
- marked all tests as @Test
test/unit/voldemort/utils/ConsistencyCheckTest.java
- update copyright notice
Jay J Wylie [Thu, 7 Mar 2013 18:48:19 +0000 (7 10:48 -0800)]
Actually adding files KeySamplerCLI and KeyVersionSamplerCLI.
Jay J Wylie [Thu, 7 Mar 2013 17:56:13 +0000 (7 09:56 -0800)]
Documented correct method of compiling protobuffs by hand.
Jay J Wylie [Thu, 7 Mar 2013 17:44:15 +0000 (7 09:44 -0800)]
Added KeySampler and KeyVersionSampler tools as a first step towards replacing "entropy" tool. Added another argument to bulk fetch operations that specifies maxRecords so that server can fetch a subset of a partition.
src/java/voldemort/utils/KeySamplerCLI.java
- Samples keys from a cluster
src/java/voldemort/utils/KeyVersionSamplerCLI.java
- Given file that lists keys per store, samples versions from each "responsible node" for that key
src/java/voldemort/client/protocol/admin/AdminClient.java
- passed maxRecords through
- TODO for future clean up of some types
src/java/voldemort/client/protocol/pb/VAdminProto.java
- auto generated!
src/java/voldemort/server/protocol/admin/AdminServiceRequestHandler.java
- white space
src/java/voldemort/server/protocol/admin/FetchStreamRequestHandler.java
src/java/voldemort/server/protocol/admin/FetchEntriesStreamRequestHandler.java
src/java/voldemort/server/protocol/admin/FetchKeysStreamRequestHandler.java
- handle maxRecords
src/java/voldemort/server/protocol/admin/FetchPartitionKeysStreamRequestHandler.java
src/java/voldemort/server/protocol/admin/FetchPartitionEntriesStreamRequestHandler.java
- handle maxRecords
- fixed usage of skipRecords
src/java/voldemort/utils/Entropy.java
- added maxRecords
src/proto/voldemort-admin.proto
- added max_records to protobuff definition
test/unit/voldemort/client/AdminFetchTest.java
- added maxRecords field to test
Jay J Wylie [Mon, 4 Mar 2013 18:08:32 +0000 (4 10:08 -0800)]
Made rebalance --show-plan slightly more verbose and added yet another analysis for cluster balance ("zone primary").
src/java/voldemort/client/rebalance/RebalancePartitionsInfo.java
- print out hostname within plan to make it easier to read (rather than having to lookup node ID)
src/java/voldemort/utils/ClusterInstance.java
- calculate "zone primary" balance to understand which hosted partitions act as pseudo-master when zoned routing is used.
Jay J Wylie [Tue, 26 Feb 2013 19:07:25 +0000 (26 11:07 -0800)]
Review and cleanup of consistency checker.
- added required argument for an output file name for bad keys
- changed Reporter to print out 'just the key' to the output file; it
outputs more info at DEBUG level in general.
- removed 'quiet' option
- throw exceptions:
- if # partitions differ across clusters
- if replication factor is hinky
- if isExpired encounters unknown type
- main catches exceptions and fails fast
- changed system.out debugging to logger.trace
Jay J Wylie [Fri, 22 Feb 2013 19:37:34 +0000 (22 11:37 -0800)]
Added more info to cluster dump to track which nodes host 'hot' partitions.
Jay J Wylie [Fri, 22 Feb 2013 16:37:47 +0000 (22 08:37 -0800)]
Tweaked Rebalancer --output-dir again to better name interim metadata files for each batch.
Jay J Wylie [Fri, 22 Feb 2013 00:17:03 +0000 (21 16:17 -0800)]
Changed Rebalancer --output-dir option to append numbers to each .xml file it outputs so that we have access to interim cluster configs.
Zhongjie Wu [Thu, 21 Feb 2013 01:58:28 +0000 (20 17:58 -0800)]
Refactored Consistency Check
Jay J Wylie [Tue, 19 Feb 2013 21:42:08 +0000 (19 13:42 -0800)]
Added server-put tracking to progress bar.
Jay J Wylie [Wed, 13 Feb 2013 19:15:53 +0000 (13 11:15 -0800)]
Default to printing out BADKEYs from ConsistencyCheck. Cleaned up debug/trace messages in ConsistencyFix.java.
Jay J Wylie [Tue, 12 Feb 2013 17:11:17 +0000 (12 09:11 -0800)]
Fixed hashmap issues in AdminClient raised during code review. Added '--parse-only' option to ConsistencyFix.
src/java/voldemort/client/protocol/admin/AdminClient.java
- Added hashCode & equals methods to AdminClient.Nodestore
- cleaned up getSocketStore to not leak concurrently created socket stores.
src/java/voldemort/utils/ConsistencyFix(CLI).java
- added parse only flag which limits the actions of the fixer to bootstrapping and parsing the input file.
Jay J Wylie [Mon, 11 Feb 2013 22:30:17 +0000 (11 14:30 -0800)]
Added 'dry-run' option and cleaned up help message.
'--dry-run' option goes through all of the read paths (reading files, reading from servers) and calculates what to write where, but does not actually do any writes!
Should combine --dry-run with these log4j settings:
log4j.logger.voldemort.utils.ConsistencyFix=TRACE
log4j.logger.voldemort.utils.ConsistencyFixWorker=DEBUG
Jay J Wylie [Mon, 11 Feb 2013 21:58:40 +0000 (11 13:58 -0800)]
Code fixes for the fixing of orphans.
src/java/voldemort/utils/ConsistencyFix.java
- added .trace output for parsing of ugly input
- pass the correct key-type into constructor
src/java/voldemort/utils/ConsistencyFixWorker.java
- substantially more .debug output to trace operation
Jay J Wylie [Mon, 11 Feb 2013 16:34:55 +0000 (11 08:34 -0800)]
Missed a file when doing the hand merge of consistency check stuff.
Jay J Wylie [Mon, 11 Feb 2013 16:14:53 +0000 (11 08:14 -0800)]
Actually invoke the BadKeyOrphanReader. Also committing a bunch of
TODOs for later cleanup.
Jay J Wylie [Sun, 10 Feb 2013 22:40:28 +0000 (10 14:40 -0800)]
Added basic code for repairing orphaned key,values.
src/java/voldemort/utils/ConsistencyFix.java
- added BadKeyOrphanReader extends BadKeyReader to consume different
input file
src/java/voldemort/utils/ConsistencyFixCLI.java
- added "orphan-format" flag to indicate that the 'bad-key-file-in' is
of orphaned key/values.
src/java/voldemort/utils/ConsistencyFixWorker.java
- added constructor to take QueryKeyResult of orphaned keys
- modified resolveReadConflicts to add orphaned key/values to
imaginary nodes for the sake of determining the value/version to be
repaired
Jay J Wylie [Sat, 9 Feb 2013 23:41:38 +0000 (9 15:41 -0800)]
Added per-server throttling to the Consistency Fixer.
Added a map of EventThrottle objects such that repair traffic to each server can be throttled. We care about throttling write rate because of its potential impact on GC and cleaning.
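The per-server throttling idea above (one throttle object per server, looked up from a map) can be sketched as follows. This is a hedged illustration only: it does not use Voldemort's EventThrottle, whose API is not shown in this log. PerServerThrottleSketch, Throttle, and reserve are hypothetical names, and the caller passes the current time in so that the arithmetic is deterministic.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of a map of per-server throttles (not Voldemort's EventThrottle).
final class PerServerThrottleSketch {

    // Spaces operations evenly: one op every (1 second / opsPerSecond).
    static final class Throttle {
        private final long nanosPerOp;
        private long nextFreeNanos = Long.MIN_VALUE;

        Throttle(long opsPerSecond) {
            this.nanosPerOp = 1_000_000_000L / opsPerSecond; // assumes opsPerSecond > 0
        }

        // Reserves one op at time 'nowNanos'; returns how long the caller should wait first.
        synchronized long reserve(long nowNanos) {
            long start = Math.max(nowNanos, nextFreeNanos);
            nextFreeNanos = start + nanosPerOp;
            return start - nowNanos;
        }
    }

    private final ConcurrentHashMap<Integer, Throttle> byNode = new ConcurrentHashMap<>();
    private final long opsPerSecond;

    PerServerThrottleSketch(long opsPerSecond) {
        this.opsPerSecond = opsPerSecond;
    }

    // Looks up (or lazily creates) the throttle for one server and reserves a write on it.
    long reserve(int nodeId, long nowNanos) {
        return byNode.computeIfAbsent(nodeId, id -> new Throttle(opsPerSecond)).reserve(nowNanos);
    }
}
```

Because each node ID has its own throttle, a burst of repairs against one server delays only further writes to that server.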
Jay J Wylie [Sat, 9 Feb 2013 00:33:10 +0000 (8 16:33 -0800)]
Re-committing a series of ZWu's commits to the consistency fix in a single batch. The other commits were not made against the same master and so this is easier than trying to figure out what went wrong in the merge/rebase.
Jay J Wylie [Fri, 8 Feb 2013 02:24:22 +0000 (7 18:24 -0800)]
Complete implementation of consistency fixer.
src/java/voldemort/utils/ConsistencyFix.java
- added Execute method that orchestrates all the threads
- switched pattern of thread execution:
- one thread for reading bad keys file that submits to ...
- a thread pool of workers that enqueues badkeys for the ...
- one thread writing still bad keys
- note: construct thread pool with a blocking queue
- switched to logging (rather than System.out/err)
- added Stats tracking
- moved methods that do complicated work (getting & repairing keys) out
src/java/voldemort/utils/ConsistencyFixWorker.java
- moved methods that do complicated work (getting & repairing keys) in
src/java/voldemort/utils/ConsistencyFixCLI.java
- added 'progress-bar' option
- got rid of 'verbose' option
- moved all thread orchestration to ConsistencyFix
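The thread pattern described in the commit above (one reader thread, a worker pool built on a blocking queue, one writer thread for keys that are still bad) can be sketched roughly as below. This is an assumption-laden illustration, not the ConsistencyFix code: FixPipelineSketch, run, and the tryFix predicate are hypothetical, keys are plain strings, and "writing" just appends to a list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

// Hypothetical sketch of the reader -> worker pool -> writer pattern (not ConsistencyFix code).
final class FixPipelineSketch {

    // Sentinel telling the writer to stop; compared by identity on purpose.
    private static final String END = new String("end");

    // Returns the keys that tryFix could not fix, as collected by the writer thread.
    static List<String> run(List<String> badKeys, Predicate<String> tryFix, int workers) {
        BlockingQueue<String> stillBad = new LinkedBlockingQueue<>();
        List<String> written = new ArrayList<>();

        // One thread writing still-bad keys.
        Thread writer = new Thread(() -> {
            try {
                for (String k = stillBad.take(); k != END; k = stillBad.take()) {
                    written.add(k);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Worker pool with a bounded blocking queue; when the queue is full the
        // submitting (reader) thread runs the task itself, which slows the reader down.
        ExecutorService pool = new ThreadPoolExecutor(
                workers, workers, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(workers * 2),
                new ThreadPoolExecutor.CallerRunsPolicy());

        // One thread reading bad keys and submitting them to the pool.
        Thread reader = new Thread(() -> {
            for (String key : badKeys) {
                pool.execute(() -> {
                    if (!tryFix.test(key)) {
                        stillBad.add(key);
                    }
                });
            }
        });

        writer.start();
        reader.start();
        try {
            reader.join();
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
            stillBad.put(END);
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while fixing keys", e);
        }
        return written;
    }
}
```

The bounded queue is the design point: without it, a fast reader could enqueue an entire bad-keys file in memory before the workers make progress.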
Jay J Wylie [Thu, 7 Feb 2013 04:08:32 +0000 (6 20:08 -0800)]
Heavily refactored ConsistencyFix. Some key outstanding TODOs remain.
src/java/voldemort/utils/ConsistencyFix.java
- split out much functionality
- switched from having entirely static interfaces to being a normal non-static class
src/java/voldemort/utils/ConsistencyFixCLI.java
- split out CLI aspect from ConsistencyFix
src/java/voldemort/utils/ConsistencyFixKeyGetter.java
- split out thread for getting bad keys from ConsistencyFix
src/java/voldemort/utils/ConsistencyFixRepairPutter.java
- split out thread for repairing bad keys from ConsistencyFix
Jay J Wylie [Thu, 7 Feb 2013 02:43:03 +0000 (6 18:43 -0800)]
Parallelized the consistency fixer. Interim checkin. Need to refactor into separate files.
src/java/voldemort/utils/ConsistencyFix.java
- 1 thread for reading file of bad keys
- 1 thread for writing any bad keys that are not fixed
- thread pool for sending gets
- thread pool for sending puts
Jay J Wylie [Wed, 6 Feb 2013 00:02:57 +0000 (5 16:02 -0800)]
Separate simple admin ops from streaming ops and clean up ConsistencyFix
src/java/voldemort/client/protocol/admin/AdminClient.java
- separated simple StoreOperations from StreamingOperations
- note: renamed 'StreamingStoreOperations storeOps' to 'StreamingOperations streamingOps'
- pulled the exception handling logic out of queryKeys/repairEntries and renamed to more generic names (getNodeKey and putNodeKeyValue).
src/java/voldemort/client/protocol/admin/QueryKeyResult.java
- made CTors public
src/java/voldemort/utils/ConsistencyFix.java
- significant clean up & refactoring
- match changes in voldemortadmintool
other
- some other TODOS
- other files touched due to renames within AdminClient
Jay J Wylie [Tue, 5 Feb 2013 21:30:16 +0000 (5 13:30 -0800)]
Added AdminStoreClient helper class to AdminClient.
src/java/voldemort/client/protocol/admin/AdminClient.java
- added AdminStoreClient which manages a ClientRequestExecutorPool and caches pertinent SocketStore (Store<ByteArray, byte[], byte[]>) objects. This allows the admin client to re-use connections when doing basic store operations (put, get, etc) against individual servers.
- added TODOs about other methods I think should be cleaned up
- significantly revised repairEntry
- added a queryKey method that uses the AdminStoreClient
src/java/voldemort/client/protocol/admin/RepairEntryResult.java
- return type for AdminClient.StreamingStoreOperations.repairEntry()
src/java/voldemort/utils/ConsistencyFix.java
- clean up to keep up with changes to AdminClient interfaces & types
- cleaned up TODOs
Jay J Wylie [Mon, 4 Feb 2013 18:04:02 +0000 (4 10:04 -0800)]
Fix errors introduced during rebase process.
Rebase with master was ugly. Picked up a bunch of AdminClient refactoring. These changes conflicted with many changes on the persistency-check branch. Had to fix the return type of AdminClient.queryKeys by hand (to Iterator<QueryKeyResult>).
Added some TODO comments to AdminClient.HelperOperations methods that should move into ClusterInstance.
Also updated some copyright dates.
Jay J Wylie [Mon, 4 Feb 2013 16:36:05 +0000 (4 08:36 -0800)]
Minor comment/todo clean up.
Jay J Wylie [Thu, 31 Jan 2013 22:00:10 +0000 (31 14:00 -0800)]
Moved AdminClient.QueryKeyResult into its own file.
src/java/voldemort/client/protocol/admin/QueryKeyResult.java
- A more complete, proper class based on the inner class that had been in AdminClient.java
Jay J Wylie [Tue, 29 Jan 2013 22:19:06 +0000 (29 14:19 -0800)]
Added more TODOs from code review feedback.
Jay J Wylie [Tue, 29 Jan 2013 21:27:43 +0000 (29 13:27 -0800)]
Further refactoring.
src/java/voldemort/utils/ConsistencyFix.java
- rename inner class VoldemortInstance to ConsistencyFixContext
- Drop methods that were duplicated in StoreInstance
src/java/voldemort/utils/StoreInstance.java
- clean up comments for method getNodeIdListForPartitionIdList
Jay J Wylie [Tue, 29 Jan 2013 21:05:54 +0000 (29 13:05 -0800)]
Significant refactoring of util methods. Added helper classes ClusterInstance, StoreInstance, and StoreDefinitionUtil.
ClusterInstance wraps up a Cluster and List<StoreDefinition> object
and provides helper methods that operate on these objects together.
StoreInstance wraps up a Cluster and a StoreDefinition object
and provides helper methods that operate on these objects together.
StoreDefinitionUtil provides helper methods that operate on either
StoreDefinition or List<StoreDefinition> objects.
Many methods have migrated out of RebalanceUtils into these new
classes.
Jay J Wylie [Tue, 29 Jan 2013 01:13:48 +0000 (28 17:13 -0800)]
Minor fix due to merging of persistency-check and rebalance-bug-fix branches
Jay J Wylie [Fri, 25 Jan 2013 19:07:51 +0000 (25 11:07 -0800)]
Minor tweak to dealing with contiguous runs of partitions.
- Hard code 10 repeated attempts to get rid of contiguous partitions.
- Added a TODO to do something perfect at some unspecified time in the future...
Jay J Wylie [Wed, 9 Jan 2013 17:23:40 +0000 (9 09:23 -0800)]
Fixed minor issue with tests in AdminServiceBasicTest. They were not including an appropriate zone list when constructing a cluster.
Jay J Wylie [Wed, 9 Jan 2013 16:07:44 +0000 (9 08:07 -0800)]
Update copyright notices.
Jay J Wylie [Wed, 9 Jan 2013 15:58:43 +0000 (9 07:58 -0800)]
Minor tweaks and refactoring
src/java/voldemort/utils/ClusterUtils.java
- added method to pretty print partition lists
- added methods to determine histogram of contiguous partition run lengths and pretty print
- moved verbose pretty printing of cluster details from ClusterRebalanceUtils to here
src/java/voldemort/utils/RebalanceClusterUtils.java
- randomized order in which zones are processed for shuffle algorithms
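A histogram of contiguous partition run lengths, as mentioned in the commit above, could be computed along these lines. This is a sketch under assumptions, not the ClusterUtils method: RunLengthSketch and runLengthHistogram are hypothetical names, the input is assumed to be a sorted list of distinct partition IDs, and the partition ring is assumed to wrap around from the last ID back to 0.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a contiguous-run histogram (not the ClusterUtils code).
final class RunLengthSketch {

    // Maps run length -> number of runs of that length.
    // Assumes sortedIds is sorted, distinct, and within [0, numPartitions).
    static Map<Integer, Integer> runLengthHistogram(List<Integer> sortedIds, int numPartitions) {
        Map<Integer, Integer> hist = new TreeMap<>();
        if (sortedIds.isEmpty()) {
            return hist;
        }
        List<Integer> runs = new ArrayList<>();
        int len = 1;
        for (int i = 1; i < sortedIds.size(); i++) {
            int prev = sortedIds.get(i - 1);
            int cur = sortedIds.get(i);
            if (cur == prev + 1) {
                len++;
            } else {
                runs.add(len);
                len = 1;
            }
        }
        runs.add(len);

        // Assumption: the ring wraps, so a run ending at the last partition
        // continues into a run starting at partition 0.
        int first = sortedIds.get(0);
        int last = sortedIds.get(sortedIds.size() - 1);
        if (runs.size() > 1 && first == 0 && last == numPartitions - 1) {
            int tail = runs.remove(runs.size() - 1);
            runs.set(0, runs.get(0) + tail);
        }

        for (int r : runs) {
            hist.merge(r, 1, Integer::sum);
        }
        return hist;
    }
}
```

For example, with 10 partitions and IDs 0, 1, 4, 5, 6, 9, the run 9, 0, 1 wraps around the ring and counts as one run of length 3 next to the run 4, 5, 6.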
Jay J Wylie [Tue, 8 Jan 2013 21:29:09 +0000 (8 13:29 -0800)]
Module level refactor: split methods out of RebalanceUtils and RebalanceClusterUtils.java into new modules NodeUtils.java and ClusterUtils.java.
Any helper methods that operate on individual clusters or on individual (sets of) nodes are now in the appropriate util class.
Jay J Wylie [Tue, 8 Jan 2013 18:37:48 +0000 (8 10:37 -0800)]
Module level refactor : split bunch of stuff out of RebalanceUtils.java into new file RebalanceClusterUtils.java
All of the algorithms to generate a new cluster.xml that is (more) balanced have been moved into this new util class.
Jay J Wylie [Tue, 8 Jan 2013 17:35:46 +0000 (8 09:35 -0800)]
Addressed code review feedback on rebalance utility and cleaned up code
src/java/voldemort/client/rebalance/RebalanceCLI.java
- Changed pattern for boolean command line options to be cleaner
- Added more options to control when cross zone moves are checked
- Added more tests to confirm the specified sub-options for generate are mutually compatible
- Revised expansive secondary documentation of various options to cover all options
src/java/voldemort/client/rebalance/RebalanceClientConfig.java
- minor variable rename
src/java/voldemort/cluster/Cluster.java
- add members and methods to track partitions & nodes by zone
src/java/voldemort/cluster/Node.java
- minor fix
src/java/voldemort/utils/RebalanceUtils.java
- refactored many of the methods I added to break them into smaller pieces
- by moving helper methods into Cluster.java, got rid of lots of redundant code
- cleaned up javadoc comments for the methods
- now prints out analysis for each try that improves balance
- validates num partitions for modified cluster same as original cluster
test/unit/voldemort/utils/RebalanceUtilsTest.java
- tests for some of the complicated/nuanced helper methods
Jay J Wylie [Wed, 12 Dec 2012 21:24:56 +0000 (12 13:24 -0800)]
Improvements to greedy swapping algorithm.
The completely greedy algorithm was too expensive. Have balanced the greedy approach with optional randomness & limits on number of swaps attempted in each round.
Exposed options all the way through the CLI.
Jay J Wylie [Wed, 12 Dec 2012 18:42:48 +0000 (12 10:42 -0800)]
Initial implementation of a greedy swapping algorithm.
Greedy swapping is unbelievably expensive because it tests every
possible pair-wise partition swap within a zone before making a single
swap. The goal is to get the most improvement possible with each swap
(to minimize the number of swaps total and so the amount of data
movement).
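The cost described in the commit above comes from evaluating every pair-wise swap before committing to one. A minimal sketch of a single greedy round follows. It is an illustration under simplifying assumptions, not the RebalanceUtils code: GreedySwapSketch, maxMinRatio, and swapOnce are hypothetical names, each node is just a list of positive partition weights, and balance is scored as the ratio of the largest node load to the smallest.

```java
import java.util.List;

// Hypothetical sketch of one round of greedy pair-wise swapping (not RebalanceUtils code).
final class GreedySwapSketch {

    // Balance score: largest node load divided by smallest node load (1.0 is perfect).
    // Assumes every node has a positive total load.
    static double maxMinRatio(List<List<Integer>> nodes) {
        long max = Long.MIN_VALUE;
        long min = Long.MAX_VALUE;
        for (List<Integer> node : nodes) {
            long load = 0;
            for (int w : node) {
                load += w;
            }
            max = Math.max(max, load);
            min = Math.min(min, load);
        }
        return (double) max / min;
    }

    // Swaps partition p of node i with partition q of node j, in place.
    private static void swap(List<List<Integer>> nodes, int i, int p, int j, int q) {
        Integer tmp = nodes.get(i).get(p);
        nodes.get(i).set(p, nodes.get(j).get(q));
        nodes.get(j).set(q, tmp);
    }

    // Tries every pair-wise swap, then applies only the single best one.
    // Returns false when no swap improves the score.
    static boolean swapOnce(List<List<Integer>> nodes) {
        double best = maxMinRatio(nodes);
        int[] pick = null;
        for (int i = 0; i < nodes.size(); i++) {
            for (int j = i + 1; j < nodes.size(); j++) {
                for (int p = 0; p < nodes.get(i).size(); p++) {
                    for (int q = 0; q < nodes.get(j).size(); q++) {
                        swap(nodes, i, p, j, q);
                        double ratio = maxMinRatio(nodes);
                        swap(nodes, i, p, j, q); // undo the trial swap
                        if (ratio < best) {
                            best = ratio;
                            pick = new int[] {i, p, j, q};
                        }
                    }
                }
            }
        }
        if (pick == null) {
            return false;
        }
        swap(nodes, pick[0], pick[1], pick[2], pick[3]);
        return true;
    }
}
```

Every round rescans all partition pairs across all node pairs, which is why a purely greedy search gets expensive quickly as the partition count grows.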
Jay J Wylie [Tue, 11 Dec 2012 21:22:14 +0000 (11 13:22 -0800)]
complete transition to max/min ratio for balancing cluster.
Jay J Wylie [Tue, 11 Dec 2012 17:13:24 +0000 (11 09:13 -0800)]
Tweaked rebalance swapping algorithm to improve effectiveness. Cleaned up code.
Changed logic of random swapping to iterate over zones while randomly swapping. Because of the dependencies between zones, this appears to work much better.
Cleaned up comments, TODOs, and so on to prepare for code review.
Jay J Wylie [Tue, 11 Dec 2012 16:35:28 +0000 (11 08:35 -0800)]
Added more rebalance options and refactored some of the rebalance utils code
- added random swap option that swaps random partitions between random nodes within a zone. This is a poor version of "simulated annealing". It produced some good results though in initial tests.
- Extended the cross zone partition move code that ensures there are no long contiguous runs of partitions within a single zone. Now, this code is followed by code that can balance the number of partitions in each zone. All of this code now runs in a loop until no further cross zone moves are performed. (A single pass of cross zone moves followed by balancing # of partitions per zone runs the risk of having contiguous partitions again.)
- Made most parameters that control these features command line options for the rebalancing --generate option.
Jay J Wylie [Fri, 7 Dec 2012 21:15:18 +0000 (7 13:15 -0800)]
Add option to limit the number of contiguous partitions within a zone.
- code to better balance partitions between zones is currently added
to the main rebalance code. this may not be the right place for this
code.
- TODO outstanding to figure out if the contig partitions manipulation
can be done and then the rest of rebalancing can
continue. Currently, the rest of the code gets broken and a separate
instantiation of rebalancing needs to be done.
Jay J Wylie [Fri, 7 Dec 2012 17:45:02 +0000 (7 09:45 -0800)]
Made analyze cluster option much more verbose. Dumps all partition maps and summarizes some aggregate partitions per node information.
Jay J Wylie [Fri, 7 Dec 2012 16:40:30 +0000 (7 08:40 -0800)]
Add many options to vary how generating partitions works.
- turn on/off check to keep primary partitions within same zone
- turn on/off check for other partitions that move zones
- specify amount of randomness among num partitions per node (w/in zone)
Jay J Wylie [Thu, 6 Dec 2012 18:51:44 +0000 (6 10:51 -0800)]
Re-wrote logic for re-balancing partitions in RebalanceUtils.generateMinCluster().
* Separates stealer nodes from donor nodes within each zone based on a goal of evenly distributing partitions within a zone across all nodes within that zone.
* Permits rebalancing logic to be run for a target cluster.xml that does not include any new nodes.
* Much more verbose output
* A few TODOs outstanding for review discussion
- Do multiple tries serve any purpose?
- Does the change that may make an existing node a stealer to smooth out the partition distribution break any other rebalance logic?
- Why does RebalanceUtils.getCrossZoneMoves sometimes return 0? This tool could be sped up by many orders of magnitude for large clusters if we could avoid calling RebalanceUtils.getCrossZoneMoves
Jay J Wylie [Thu, 6 Dec 2012 00:41:03 +0000 (5 16:41 -0800)]
Hack to better balance stealing/donation
Jay J Wylie [Wed, 5 Dec 2012 23:32:31 +0000 (5 15:32 -0800)]
Initial work on fixing our rebalance to more evenly distribute stealing and donating.
Jay J Wylie [Wed, 23 Jan 2013 16:31:41 +0000 (23 08:31 -0800)]
Cleaned up initial implementation of ConsistencyFix for preliminary code review.
src/java/voldemort/client/protocol/admin/AdminClient.java
- annotated repairEntry method with TODO items to get review feedback on
- Created new type QueryKeyResult to get rid of a struct of type Pair<ByteArray, Pair<List<Versioned<byte[]>>, Exception>>
src/java/voldemort/utils/ConsistencyFix.java
- added output file to list non-fixed keys in
- modulate output based on verbose flag
- refactored giant method(s) into smaller methods
- cleaned up TODO comments for review feedback
test/unit/voldemort/client/AdminServiceBasicTest.java
- uses QueryKeyResult type
src/java/voldemort/VoldemortAdminTool.java
- uses QueryKeyResult type
Jay J Wylie [Tue, 22 Jan 2013 00:48:46 +0000 (21 16:48 -0800)]
Added multiple-key options to ConsistencyFix tool.
src/java/voldemort/utils/ConsistencyFix.java
- Options for keys:
--key : single key
--keys : list of keys on command line
--key-file : file of keys, one per line.
keys are in hexadecimal.
- A bunch of TODOs for next step code cleanup, hardening, improving.
Jay J Wylie [Thu, 17 Jan 2013 21:59:39 +0000 (17 13:59 -0800)]
First step towards a consistency fixer tool.
src/java/voldemort/utils/ConsistencyFix.java
- CLI and initial implementation of fixer
- each invocation fixes one key for one store on one cluster
src/java/voldemort/client/protocol/admin/AdminClient.java
- added repairEntry method. Very much a work-in-progress.
src/java/voldemort/utils/ByteUtils.java
- added fromHexString helper function. (Sister function of toHexString)
test/unit/voldemort/utils/ByteUtilsTest.java
- improved tests for to/from HexString
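Since the fixer takes keys in hexadecimal, a to/from hex round trip like the one added above might look as follows. This is a sketch, not the ByteUtils code: the class HexSketch is hypothetical, and the choice to reject odd-length or non-hex input with IllegalArgumentException is an assumption.

```java
// Hypothetical sketch of a hex round trip (not the ByteUtils implementation).
final class HexSketch {

    // Encodes bytes as lowercase hex, two characters per byte.
    static String toHexString(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xf, 16));
            sb.append(Character.forDigit(b & 0xf, 16));
        }
        return sb.toString();
    }

    // Decodes a hex string; assumption: malformed input is rejected rather than skipped.
    static byte[] fromHexString(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }
}
```

A round-trip check such as toHexString(fromHexString(s)).equals(s) for lowercase input is the kind of property the improved tests would naturally cover.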
Zhongjie Wu [Tue, 15 Jan 2013 21:41:00 +0000 (15 13:41 -0800)]
Updated ConsistencyCheck.java for new feature
Zhongjie Wu [Thu, 10 Jan 2013 06:07:31 +0000 (9 22:07 -0800)]
Added consistency check tool. Support single partition only
Zhongjie Wu [Tue, 29 Jan 2013 00:07:57 +0000 (28 16:07 -0800)]
Added non-zoned cluster(multiple bootstrap url) support for consistency check; This uses value hash for comparing values instead of version object