diff --git a/modules/ROOT/nav.adoc b/modules/ROOT/nav.adoc index 42482045bf..5b2b96241b 100644 --- a/modules/ROOT/nav.adoc +++ b/modules/ROOT/nav.adoc @@ -358,6 +358,7 @@ include::cli:partial$cbcli/nav.adoc[] **** xref:rest-api:rest-cluster-removenode.adoc[Removing Nodes from Clusters] *** xref:rest-api:rest-rebalance-overview.adoc[Rebalance] + **** xref:rest-api:file-based-data-rebalance.adoc[] **** xref:rest-api:rest-retrieve-cluster-rebalance-reason-codes.adoc[Getting Rebalance Reason Codes] **** xref:rest-api:rest-cluster-rebalance.adoc[Rebalancing the Cluster] **** xref:rest-api:rest-get-rebalance-progress.adoc[Getting Rebalance Progress] diff --git a/modules/introduction/partials/new-features-81.adoc b/modules/introduction/partials/new-features-81.adoc index 58b3a29b1f..c25778857f 100644 --- a/modules/introduction/partials/new-features-81.adoc +++ b/modules/introduction/partials/new-features-81.adoc @@ -15,7 +15,29 @@ TBD Couchbase Server 8.1 introduces several new features for the Data Service. -TBD +==== File-Based Rebalance (FBR) + +Couchbase Server 8.1 introduces File-Based Rebalance (FBR) for the Data Service. +FBR accelerates cluster rebalance by copying vBucket storage files directly between nodes rather than streaming data through the DCP (Database Change Protocol) replication pipeline. +This eliminates the serialization and pipeline overhead of DCP backfill for large, disk-resident datasets. + +The following changes apply to the Data Service rebalance behavior in 8.1: + +* *File-Based Rebalance*: FBR transfers vBucket data files directly from the source node to the destination node during the backfill phase of a vBucket move, bypassing the full DCP backfill mechanism used in prior releases. + +* *Enabled by default*: FBR is enabled by default for Enterprise Edition, both for self-managed deployments and Couchbase Capella. +No configuration is required to activate it. 
+ +* *Automatic rebalance type selection*: The server automatically determines whether FBR or DCP is more efficient for each vBucket move. +When FBR is not applicable or not expected to be faster, the server falls back to DCP automatically. + +* *New bucket-level rebalance type setting*: A new per-bucket setting, `dataServiceRebalanceType`, allows operators to control rebalance behavior at the bucket level, overriding the cluster-level FBR setting. + +* *Separate vBucket move concurrency for FBR*: A new setting, `dataServiceFileBasedRebalanceMovesPerNode`, controls the maximum number of concurrent file-based vBucket moves per node. +This is independent of the existing `rebalanceMovesPerNode` setting, which applies to DCP rebalance. + +NOTE: FBR is an Enterprise Edition feature. +Community Edition continues to use DCP-based rebalance for all vBucket moves. === Non-Data Services diff --git a/modules/learn/pages/clusters-and-availability/rebalance.adoc b/modules/learn/pages/clusters-and-availability/rebalance.adoc index 798acba9d7..4c719e322f 100644 --- a/modules/learn/pages/clusters-and-availability/rebalance.adoc +++ b/modules/learn/pages/clusters-and-availability/rebalance.adoc @@ -1,5 +1,5 @@ = Rebalance -:description: pass:q[_Rebalance_ redistributes data, indexes, event processing, and query processing among available nodes.] +:description: pass:q[Rebalance redistributes data, indexes, event processing, and query processing among available nodes.] 
:page-aliases: clustersetup:rebalance :page-toclevels: 3 @@ -9,31 +9,32 @@ [#understanding-rebalance] == Understanding Rebalance -When one or more nodes have been _brought into_ a cluster (either by xref:learn:clusters-and-availability/nodes.adoc#node-addition[adding] or xref:learn:clusters-and-availability/nodes.adoc#node-joining[joining]), or have been _taken out_ of a cluster (either through xref:learn:clusters-and-availability/removal.adoc[Removal] or xref:learn:clusters-and-availability/failover.adoc[Failover]), _rebalance_ redistributes data, indexes, event processing, and query processing among available nodes. -The _cluster map_ is correspondingly updated and distributed to clients. -The process occurs while the cluster continues to service requests for data. +When you add one or more nodes to a cluster by xref:learn:clusters-and-availability/nodes.adoc#node-addition[adding] or xref:learn:clusters-and-availability/nodes.adoc#node-joining[joining], or remove nodes by xref:learn:clusters-and-availability/removal.adoc[removal] or xref:learn:clusters-and-availability/failover.adoc[failover], you must rebalance the cluster. +Rebalance redistributes data, indexes, event processing, and query processing among available nodes. +Rebalance also updates the cluster map and distributes it to clients. +This process occurs while the cluster continues to service requests for data. -See xref:learn:clusters-and-availability/cluster-manager.adoc[Cluster Manager], for information on the cluster map. +See xref:learn:clusters-and-availability/cluster-manager.adoc[Cluster Manager], for information about the cluster map. See xref:manage:manage-nodes/node-management-overview.adoc[Manage Nodes and Clusters], for practical examples of using rebalance. [#rebalance-bucket-rank] == Bucket Rank -In Couchbase Server Version 7.6 and later, each bucket on the cluster (Couchbase or Ephemeral) can be assigned a _rank_. 
+In Couchbase Server Version 7.6 and later, each bucket on the cluster (Couchbase or Ephemeral) can be assigned a rank. The value is an integer from `0` (the default) to `1000`, inclusive. -Buckets with higher ranks are handled by the rebalance process _before_ buckets with lower ranks. -For example, if a cluster hosts four buckets, which are named _A_, _B_, _C_, and _D_; and bucket _A_ is explicitly assigned a rank of `10`, while buckets _B_, _C_, and _D_ are left with the default rank of `0`; when rebalance occurs, the vBuckets for bucket _A_ are addressed first; then, vBuckets for the other buckets are addressed, with the Cluster Manager making determinations as to the appropriate handling-order for those other buckets. +Buckets with higher ranks are handled by the rebalance process before buckets with lower ranks. +For example, if a cluster hosts four buckets, which are named A, B, C, and D; and bucket A is explicitly assigned a rank of `10`, while buckets B, C, and D are left with the default rank of `0`; when rebalance occurs, the vBuckets for bucket A are addressed first; then, vBuckets for the other buckets are addressed, with the Cluster Manager making determinations as to the appropriate handling-order for those other buckets. This assignment of `rank` allows a cluster's most mission-critical data to be rebalanced with top priority. -Bucket _rank_ can be established with either the CLI (see xref:cli:cbcli/couchbase-cli-bucket-create.adoc[bucket-create] and xref:cli:cbcli/couchbase-cli-bucket-edit.adoc[bucket-edit]) or the REST API (see xref:rest-api:rest-bucket-create.adoc[Creating and Editing Buckets]). +Bucket rank can be established with either the CLI (see xref:cli:cbcli/couchbase-cli-bucket-create.adoc[bucket-create] and xref:cli:cbcli/couchbase-cli-bucket-edit.adoc[bucket-edit]) or the REST API (see xref:rest-api:rest-bucket-create.adoc[Creating and Editing Buckets]). 
[#rebalance-stages] == Rebalance Stages -Each rebalance proceeds in sequential _stages_. +Each rebalance proceeds in sequential stages. Each stage corresponds to a Couchbase Service, deployed on the cluster. -Therefore, if all services have been deployed, there are _seven_ stages in all -- one each for the _Data_, _Query_, _Index_, _Search_, _Eventing_, _Backup_, and _Analytics_ services. +Therefore, if all services have been deployed, there are seven stages in all -- one each for the Data, Query, Index, Search, Eventing, Backup, and Analytics services. When all stages have been completed, the rebalance process itself is complete. [#rebalancing-the-data-service] @@ -55,8 +56,8 @@ See xref:learn:clusters-and-availability/intra-cluster-replication.adoc[Intra-Cl [#data-service-rebalance-phases] === Data-Service Rebalance Phases -During the Data Service rebalance stage, vBuckets are moved in _phases_. -The phases -- which differ, depending on whether the vBucket is an _active_ or a _replica_ vBucket -- are described below. +During the Data Service rebalance stage, vBuckets are moved in phases. +The phases -- which differ, depending on whether the vBucket is an active or a replica vBucket -- are described below. [#rebalance-phases-for-replica-vbuckets] ==== Rebalance Phases for Replica vBuckets @@ -65,17 +66,17 @@ The phases through which rebalance moves a replica vBucket are shown by the foll image::clusters-and-availability/replicaVbucketMove.png[,640,align=left] -The move has two principal phases. Phase 1 is _Backfill_. Phase 2 is _Book-keeping_. +The move has two principal phases. Phase 1 is Backfill. Phase 2 is Book-keeping. -Phase 1, _Backfill_, itself consists of two subphases. +Phase 1, Backfill, itself consists of two subphases. The first subphase comprises the movement of the replica vBucket data from its node of origin to the memory of the destination node. 
The second subphase comprises the writing of the replica vBucket data from the memory to the disk of the destination node. -The time required for this second subphase, which only applies to Couchbase Buckets, is termed _Persistence Time_. -The time required for the entire _Backfill_ process, including _Persistence Time_, is termed _Backfill Time_. +The time required for this second subphase, which only applies to Couchbase Buckets, is termed Persistence Time. +The time required for the entire Backfill process, including Persistence Time, is termed Backfill Time. -Phase 2, _Book-keeping_, comprises various ancillary tasks required for move-completion. +Phase 2, Book-keeping, comprises various ancillary tasks required for move-completion. -The total time required for the move is calculated by adding _Backfill Time_ to the time required for Phase 2, _Book-keeping_; and is termed _Move Time_. +The total time required for the move is calculated by adding Backfill Time to the time required for Phase 2, Book-keeping; and is termed Move Time. [#rebalance-phases-for-active-vbuckets] ==== Rebalance Phases for Active vBuckets @@ -85,23 +86,23 @@ The phases in which rebalance moves an active vBucket are shown by the following image::clusters-and-availability/activeVbucketMove.png[,780,align=left] The move has four principal phases. -Phase 1, _Backfill_, and Phase 2, _Book-keeping_, are identical to those required for replica vBuckets; except that the _Book-keeping_ phase includes additional _Persistence Time_. +Phase 1, Backfill, and Phase 2, Book-keeping, are identical to those required for replica vBuckets; except that the Book-keeping phase includes additional Persistence Time. -Phase 3, _Active Takeover_, comprises the operations required to establish the relocated vBucket as the new active copy. -The time required for Phase 3 is termed _Takeover Time_. +Phase 3, Active Takeover, comprises the operations required to establish the relocated vBucket as the new active copy. 
+The time required for Phase 3 is termed Takeover Time. -Phase 4, _Book-keeping_, comprises a final set of ancillary tasks, required for move-completion. +Phase 4, Book-keeping, comprises a final set of ancillary tasks, required for move-completion. -The total time for the move is termed _Move Time_. +The total time for the move is termed Move Time. [#limiting-concurrent-vbucket-moves] === Limiting Concurrent vBucket Moves -Since vBucket moves are highly resource-intensive, Couchbase Server allows the concurrency of such moves to be _limited_: a setting is provided that determines the maximum number of concurrent vBucket moves permitted on any node. +Since vBucket moves are highly resource-intensive, Couchbase Server allows the concurrency of such moves to be limited: a setting is provided that determines the maximum number of concurrent vBucket moves permitted on any node. The minimum value for the setting is `1`, the maximum `64`, the default `4`. -A _move_ counts toward this restriction only when in the _backfill_ phase, as described above, in xref:learn:clusters-and-availability/rebalance.adoc#data-service-rebalance-phases[Data Service Rebalance Phases]. -The move may be of either an _active_ or a _replica_ vBucket. +A move counts toward this restriction only when in the backfill phase, as described above, in xref:learn:clusters-and-availability/rebalance.adoc#data-service-rebalance-phases[Data Service Rebalance Phases]. +The move may be of either an active or a replica vBucket. A node's participation in the move may be as either a source or a target. For example, if a node is at a given time the source for two moves in backfill phase, and is the target for two additional moves in backfill phase, and the setting stands at `4`, the node may participate in the backfill phase of no additional moves, until at least one of its current moves has completed its backfill phase. 
@@ -113,10 +114,105 @@ Conversely, a lower setting may degrade rebalance performance, while freeing up Note, however, that rebalance performance can be affected by many additional factors; and that in consequence, changing this parameter may not always have the expected effects. Note also that a higher setting, due to its additional consumption of resources, may degrade the performance of other systems, including the Data Service. +[#file-based-rebalance] +=== File-Based Rebalance (FBR) + +File-Based Rebalance (FBR) for the Data Service copies the underlying Couchstore or Magma storage files directly from the source node to the destination node during the backfill phase of a vBucket move. +Copying the files reduces CPU overhead on both nodes and decouples rebalance time from item count, making rebalance time proportional to data size on disk rather than document count. + +NOTE: FBR is a Couchbase Server Enterprise Edition feature. +Couchbase Server Community Edition continues to use DCP-based rebalance for all vBucket moves. + +[#fbr-vs-dcp] +==== DCP Rebalance vs File-Based Rebalance + +Prior to Couchbase Server 8.1, the Data Service always rebalanced using DCP backfills. +A DCP backfill reads each piece of data from the vBucket on the source node. +It then transmits it over the network using the DCP streaming protocol to the destination node. +The destination node then writes the data into the target vBucket. +This approach is reliable, but its overhead is proportional to the number of items in the dataset. +For each document, the DCP backfill deserializes it, transmits it, and re-serializes it on the destination node. + +FBR replaces DCP backfill for eligible vBucket moves by copying the storage files directly, reducing CPU usage and improving network throughput. 
+ +[width="100%",cols="25%,36%,39%",options="header"] +|=== +|Aspect |DCP Rebalance |File-Based Rebalance (FBR) + +|Transfer mechanism +|Stream documents through DCP pipeline +|Copy storage files directly over the network + +|Time scales with +|Number of items in the dataset +|Size of data on disk + +|CPU overhead +|Higher — serialization on source, deserialization on destination +|Lower — file copy with no document processing + +|Best suited for +|Small datasets, storage migration, ephemeral buckets +|Large disk-resident (DGM) datasets, swap rebalance, rebalance-in + +|Enterprise Edition only +|No — available in all editions +|Yes — EE only + +|Default in 8.1 +|Fallback when FBR is not applicable +|Default for all eligible vBucket moves +|=== + +[#fbr-backfill-takeover] +==== Backfill and Takeover Phases + +FBR applies only to the Backfill phase of a vBucket move, as described in xref:learn:clusters-and-availability/rebalance.adoc#data-service-rebalance-phases[Data Service Rebalance Phases]. +The Active Takeover phase always uses DCP, regardless of whether FBR was used for backfill. +Because takeover always uses DCP, the DCP rebalance infrastructure remains fully operational in 8.1. + +[#fbr-automatic-selection] +==== Automatic Rebalance Method Selection + +When you enable FBR, the server automatically selects the most efficient method for each vBucket move. +It uses FBR when it estimates the FBR would be at least 10% faster than DCP. +The server falls back to DCP automatically in the following situations: + +Storage engine migration between Couchstore and Magma:: +Migrating the storage format requires a full data reload, which is only possible through DCP. + +Eviction policy changes:: +Changing a bucket's eviction policy requires data to be reprocessed during rebalance, which requires DCP. + +Ephemeral buckets:: +Ephemeral buckets store data entirely in memory and have no persistent storage files for FBR to copy. 
+ +Scenarios where DCP may be faster:: +For example, DCP can be faster when the data resident ratio is 100%. + +[#fbr-performance] +==== Performance + +The primary goal of FBR is to deliver significant improvements to rebalance speed for large datasets. +The target throughput is 1 TB of data movement in 30 minutes. +Rebalance time scales proportionally with the amount of data on disk and is independent of item count. +Throughput depends on the available network bandwidth, disk IOPS, and CPU resources on the participating nodes. + +NOTE: Workloads with lower resident ratios (data greater than memory) show the greatest benefit from FBR. + +[#fbr-concurrent-moves] +==== Limiting Concurrent File-Based vBucket Moves + +In Couchbase Server 8.1, FBR uses a separate concurrent-moves setting, `dataServiceFileBasedRebalanceMovesPerNode`, which is independent of the DCP `rebalanceMovesPerNode` setting. +The default value for both settings is `4`. +Changing one setting does not affect the other. + +The setting may be established by means of the xref:manage:manage-settings/general-settings.adoc#rebalance-settings[Couchbase Web Console] or the xref:manage:manage-settings/general-settings.adoc#rebalance-settings-via-rest[REST API]. + [#rebalance-reporting] === Accessing Rebalance Reports -Couchbase Server creates a _report_ on every rebalance that occurs. +Couchbase Server creates a report on every rebalance that occurs. The report contains a JSON document, which can be inspected in any browser or editor. The document provides summaries of the concluded rebalance activity, as well as details for each of the vBuckets affected: in consequence, the report may be of considerable length. @@ -126,8 +222,8 @@ On conclusion of a rebalance, its report can be accessed in any of the following * By means of the REST API, as described in xref:rest-api:rest-get-cluster-tasks.adoc[Getting Cluster Tasks]. 
-* By accessing the directory `/opt/couchbase/var/lib/couchbase/logs/rebalance` on _any_ of the cluster nodes. -A rebalance report is maintained here for (up to) the last _five_ rebalances performed. +* By accessing the directory `/opt/couchbase/var/lib/couchbase/logs/rebalance` on any of the cluster nodes. +A rebalance report is maintained here for (up to) the last five rebalances performed. Each report is provided as a `*.json` file, whose name indicates the time at which the report was run -- for example, `rebalance_report_2020-03-17T11:10:17Z.json`. A complete account of the report-content is provided in the xref:rebalance-reference:rebalance-reference.adoc[Rebalance Reference]. @@ -156,14 +252,14 @@ For more information about the rebalance operation on Index Service, see xref:le The Search Service automatically partitions its indexes across all Search nodes in the cluster, ensuring optimal distribution, following rebalance. -To achieve this, in versions of Couchbase Server prior to 7.1, by default, partitions needing to be newly created were entirely _built_, on their newly assigned nodes. -In 7.1 and later versions, by default, new partitions are instead created by the _transfer_ of partition files from old nodes to new nodes: this significantly enhances performance. -This is an Enterprise-only feature, which requires all Search Service nodes _either_ to be running 7.1 or later; _or_ to be running 7.0.2, with the feature explicitly switched on. +To achieve this, in versions of Couchbase Server prior to 7.1, by default, partitions needing to be newly created were entirely built, on their newly assigned nodes. +In 7.1 and later versions, by default, new partitions are instead created by the transfer of partition files from old nodes to new nodes: this significantly enhances performance. +This is an Enterprise-only feature, which requires all Search Service nodes either to be running 7.1 or later; or to be running 7.0.2, with the feature explicitly switched on. 
Community Edition clusters that are upgraded to Enterprise Edition 7.1 and later versions thus gain this feature in its default setting. Community Edition clusters that are upgraded to Enterprise Edition 7.0.2 can have this feature switched on, subsequent to upgrade. -During file transfer, should an unresolvable error occur, file transfer is automatically abandoned, and _partition build_ is used instead. +During file transfer, should an unresolvable error occur, file transfer is automatically abandoned, and partition build is used instead. The file-transfer feature can be enabled and disabled by means of the REST API. See xref:fts-rest-manage:index.adoc[Search Manager Options]. @@ -190,26 +286,26 @@ If needed, you can retry these requests on another Query node that is still in t [#rebalancing-the-eventing-service] === Eventing Service -When an Eventing Service node has been added or removed, rebalance causes the mutation (_vBucket_ processing ownership) and timer event processing workload to be redistributed among available Eventing Service nodes. +When an Eventing Service node has been added or removed, rebalance causes the mutation (vBucket processing ownership) and timer event processing workload to be redistributed among available Eventing Service nodes. The Eventing Service continues to process mutations both during and after rebalance. Checkpoint information ensures that no mutations are lost. [#rebalancing-the-analytics-service] === Analytics Service -The Analytics Service uses _shadow data_, which is a copy of all or some of the data maintained by the Data Service. +The Analytics Service uses shadow data, which is a copy of all or some of the data maintained by the Data Service. By default, the shadow data is not replicated; however, it may be partitioned across all cluster nodes that run the Analytics Service. Starting with Couchbase Server 7.1, the shadow data and its partitions may be replicated up to 3 times. 
Each replica resides on an Analytics node: a given Analytics node can host a replica partition, or the active partition on which replicas are based. -If there are _no_ Analytics replicas, and an Analytics node fails over, the Analytics Service stops working cluster-wide: ingestion of shadow data stops and no Analytics operations can be run. +If there are no Analytics replicas, and an Analytics node fails over, the Analytics Service stops working cluster-wide: ingestion of shadow data stops and no Analytics operations can be run. In this case: * If the Analytics node is recovered, the Analytics Service is resumed and ingestion of shadow data resumes from the point before the node failed over. * If the Analytics node is removed, the Analytics Service becomes active again after rebalance, but ingestion of shadow data must begin again from scratch. -If there _are_ Analytics replicas, and an Analytics node fails over, the Analytics Service continues to work: one of the replicas is promoted to serve the shadow data that was stored on the failed over node. +If there are Analytics replicas, and an Analytics node fails over, the Analytics Service continues to work: one of the replicas is promoted to serve the shadow data that was stored on the failed over node. The Analytics Service only needs to rebuild any shadow data that isn't already ingested from the Data Service, depending on the state of the promoted replica. In this case: @@ -218,7 +314,7 @@ In this case: * If the Analytics node is removed, the shadow data is redistributed among the remaining Analytics nodes in the cluster. If no Analytics Service node has been removed or replaced, shadow data is not affected by rebalance. -In consequence of rebalance, the Analytics Service receives an updated _cluster map_, and continues to work with the modified vBucket-topology. +In consequence of rebalance, the Analytics Service receives an updated cluster map, and continues to work with the modified vBucket-topology. 
[#rebalancing-the-backup-service] === Backup Service @@ -227,12 +323,12 @@ A rebalance causes the scheduler for the Backup Service to stop running. This means that no new backup tasks are triggered until the rebalance has concluded; at which point, the scheduler restarts, and reconstructs the task schedule. Then, the triggering of Backup Service tasks is resumed. -Note that a rebalance has the effect of _restarting_ the Backup Service whenever the service has previously been stopped, due to loss of its _leader_: for information, see xref:learn:services-and-indexes/services/backup-service.adoc#backup-service-architecture[Backup-Service Architecture]. +Note that a rebalance has the effect of restarting the Backup Service whenever the service has previously been stopped, due to loss of its leader: for information, see xref:learn:services-and-indexes/services/backup-service.adoc#backup-service-architecture[Backup-Service Architecture]. [#rebalance-failure-handling] == Rebalance Failure-Handling -Rebalance failures can optionally be responded to automatically, with up to 3 _retries_. +Rebalance failures can optionally be responded to automatically, with up to 3 retries. The number of seconds required to elapse between retries can also be configured. For information on configuration options, see xref:manage:manage-settings/general-settings.adoc[General Settings]. For information on failure-notifications, and options for cancelling rebalance-retries, see xref:manage:manage-nodes/add-node-and-rebalance.adoc#automated-rebalance-failure-handling[Automated Rebalance Failure Handling]. 
diff --git a/modules/manage/pages/manage-nodes/add-node-and-rebalance.adoc b/modules/manage/pages/manage-nodes/add-node-and-rebalance.adoc index d323773ea5..b7f9bf9619 100644 --- a/modules/manage/pages/manage-nodes/add-node-and-rebalance.adoc +++ b/modules/manage/pages/manage-nodes/add-node-and-rebalance.adoc @@ -186,6 +186,10 @@ Note that the figure in the *Items* column for node `10.142.181.101` is `31.5 K/ The figure for `10.142.181.102` indicates the converse. Therefore, replication has successfully distributed the contents of `travel-sample` across both nodes, providing a single replica vBucket for each active vBucket. +NOTE: By default, Couchbase Server Enterprise Edition automatically uses File-Based Rebalance (FBR) to move data for eligible vBuckets during node addition. +The server selects the optimal rebalance method for each vBucket move transparently. +For information, see xref:learn:clusters-and-availability/rebalance.adoc#file-based-rebalance[File-Based Rebalance (FBR)]. + [#node-information-within-the-ui] ==== Node Information Within the UI diff --git a/modules/manage/pages/manage-settings/general-settings.adoc b/modules/manage/pages/manage-settings/general-settings.adoc index cd5b2601f4..f0abb92172 100644 --- a/modules/manage/pages/manage-settings/general-settings.adoc +++ b/modules/manage/pages/manage-settings/general-settings.adoc @@ -193,6 +193,16 @@ The *Max moves per node during rebalance* option establishes the maximum number The minimum value for the parameter is `1`, the maximum `64`, the default `4`. For information, see xref:learn:clusters-and-availability/rebalance.adoc#limiting-concurrent-vbucket-moves[Limiting Concurrent vBucket Moves]. +Couchbase Server Enterprise Edition clusters have an additional control named *Maximum Concurrent vBucket Moves (File-Based)*. +It controls how many vBucket files a node can copy concurrently when performing a File-Based Rebalance (FBR). +The range for this setting is from `1` to `1024`. 
+This setting applies only to FBR vBucket moves and is independent of the DCP concurrent-moves setting above. + +The *Retry rebalance* option applies to both DCP and FBR rebalance types. +No separate configuration is required; retry behavior is the same regardless of which rebalance method was used for the failed attempt. + +For information on FBR, see xref:learn:clusters-and-availability/rebalance.adoc#file-based-rebalance[File-Based Rebalance (FBR)]. + [#data-settings] === Data Settings The fields that appear when you expand the *Advanced Data Settings* section let you control filesystem use limits and I/O thread allocation. diff --git a/modules/rest-api/pages/file-based-data-rebalance.adoc b/modules/rest-api/pages/file-based-data-rebalance.adoc new file mode 100644 index 0000000000..23c078914c --- /dev/null +++ b/modules/rest-api/pages/file-based-data-rebalance.adoc @@ -0,0 +1,217 @@ += Configure File-Based Data Rebalance +:description: pass:q[You control Data Service File-Based Rebalance (FBR) using the `/internalSettings` REST API endpoint.] +:page-edition: Enterprise Edition +:page-topic-type: reference +:page-toclevels: 3 + +[abstract] +{description} + +== Description + +The file-based rebalance option for the Data Service directly copies vBucket data files between nodes during rebalance. +This usually improves rebalance speed compared to the DCP-based backfill method. +See xref:learn:clusters-and-availability/rebalance.adoc#file-based-rebalance[File-Based Rebalance] for more information about file-based rebalance and how it compares to DCP-based rebalance. + +== HTTP Methods + +This API endpoint supports the following methods: + +* <<#get-settings>> +* <<#set-settings>> + +[#get-settings] +== Get FBR Settings + +Get the current values of the FBR settings. +The FBR settings are part of the internal settings, so this API returns all internal settings. 
+ +.List All Internal Settings +---- +GET /internalSettings +---- + +=== curl Syntax + +[source,bash] +---- +curl -u ${USER}:${PASSWORD} -X GET \ + http[s]://${HOST}:${PORT}/internalSettings \ + | jq 'with_entries(select(.key | startswith("dataService")))' +---- + +.Path Parameters +:priv-link: get-privs +include::partial$user_pwd_host_port_params.adoc[] + + +[#get-privs] +=== Required Privileges + +You must have at least 1 of the following roles to get the FBR settings: + +* xref:learn:security/roles.adoc#full-admin[Full Admin] +* xref:learn:security/roles.adoc#cluster-admin[Cluster Admin] +* xref:learn:security/roles.adoc#ro-security-admin[Read-Only Security Admin] +* xref:learn:security/roles.adoc#security-admin[Security Admin] +* xref:learn:security/roles.adoc#local-user-security-admin[Local User Admin] + +=== Responses + +`200 OK`:: +Returns the internal settings. +See <<#get-settings-example>> for sample output. + +`401 Unauthorized`:: +Returned when authentication fails, such as when the password is incorrect. + +`403 Forbidden`:: +Returned if you do not have 1 of the roles listed in <<#get-privs>>. + +[#get-settings-example] +=== Examples + +The following example gets the current values of the FBR settings. +It filters the internal settings using the `jq` command to show only the FBR settings. + +[source,bash] +---- +curl -s -u Administrator:password \ + -X GET http://node1.example.com:8091/internalSettings \ + | jq 'with_entries(select(.key | startswith("dataService")))' +---- + +Running the example returns a JSON object containing the current FBR settings, such as the following: + +[source,json] +---- +{ + "dataServiceFileBasedRebalanceEnabled": true, + "dataServiceFileBasedRebalanceMovesPerNode": 4 +} +---- + +The keys returned by the previous example are the FBR settings you can use to control FBR behavior: + +dataServiceFileBasedRebalanceEnabled:: +When `true`, eligible vBucket moves use FBR; when `false`, all vBucket moves use DCP. 
+ +dataServiceFileBasedRebalanceMovesPerNode:: +The maximum number of concurrent file-based vBucket moves per node during rebalance. + + +[#set-settings] +== Configure FBR + +You can change the FBR settings using the REST API. + +.Change FBR Settings +---- +POST /internalSettings +---- + +=== curl Syntax + +[source,bash] +---- +curl -sS -u ${USER}:${PASSWORD} \ + -X POST http[s]://${HOST}:${PORT}/internalSettings \ + [-d dataServiceFileBasedRebalanceEnabled={true|false}] \ + [-d dataServiceFileBasedRebalanceMovesPerNode=] +---- + +=== Path Parameters +:priv-link: settings-privs +include::partial$user-pw-host-port-params.adoc[] + +=== Parameters + +`dataServiceFileBasedRebalanceEnabled`:: +Set to `true` (the default) to allow the Data Service to use FBR when rebalancing data in eligible vBuckets. +Set to `false` to force the Data Service to use DCP when rebalancing data. + ++ +NOTE: This setting can be overridden on a per-bucket basis using the bucket's `dataServiceRebalanceType` setting. +See xref:manage:manage-buckets/edit-bucket.adoc#bucket-rebalance-type[Bucket-Level Rebalance Type] for more information. + +`dataServiceFileBasedRebalanceMovesPerNode`:: +Integer value that sets the maximum number of concurrent file-based vBucket moves per node during rebalance. +Valid values range from `1` to `1024`. +The default value is `4`. +Increase this value when you rebalance while the cluster is under light load. +For example, consider increasing this value during scheduled maintenance windows. +Decrease this value to reduce the overhead of rebalance on running workloads. + ++ +NOTE: This setting is independent of `rebalanceMovesPerNode`, which applies to DCP rebalance.
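The documented range for `dataServiceFileBasedRebalanceMovesPerNode` (`1` to `1024`) can be checked on the client before calling the API. The following is a minimal sketch, not part of Couchbase Server: `fbr_moves_arg` is a hypothetical helper name, and the host and credentials in the usage comment are the placeholder values used in this page's examples.

```shell
# Hypothetical helper (an assumption for illustration, not a Couchbase tool):
# validate a candidate value and print the form field to pass to curl -d.
fbr_moves_arg() {
  value="$1"
  case "$value" in
    # Reject empty input, non-digit characters, and leading zeros.
    '' | *[!0-9]* | 0*) value="" ;;
  esac
  # Cap the length first so the numeric comparison cannot overflow.
  if [ -z "$value" ] || [ "${#value}" -gt 4 ] || [ "$value" -gt 1024 ]; then
    echo "error: '$1' must be an integer from 1 to 1024" >&2
    return 1
  fi
  printf 'dataServiceFileBasedRebalanceMovesPerNode=%s\n' "$value"
}

# Usage (placeholder host and credentials):
# arg=$(fbr_moves_arg 8) && curl -sS -u Administrator:password \
#   -X POST http://node1.example.com:8091/internalSettings -d "$arg"
```

Checking locally only avoids a round trip; the server still validates the value, so a client-side check like this is a convenience rather than a substitute for handling error responses.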
+ +[#settings-privs] +=== Required Privileges + +You must have 1 of the following roles to change the FBR settings: + +* xref:learn:security/roles.adoc#full-admin[Full Admin] +* xref:learn:security/roles.adoc#cluster-admin[Cluster Admin] + +=== Responses + +`200 OK`:: +Updating the value or values succeeded. + +`400 Bad Request`:: +Returned if you specify an invalid value for either setting or an out-of-range value for `dataServiceFileBasedRebalanceMovesPerNode`. + ++ +Also returns a JSON object that describes the error, such as the following: + ++ +[source,json] +---- +{ + "errors": [ + "dataServiceFileBasedRebalanceMovesPerNode - The value must be between 1 and 1024." + ] +} +---- + +`401 Unauthorized`:: +Returned when authentication fails, such as when the password is incorrect. + +`403 Forbidden`:: +Returned if you do not have the proper roles to call this API. +See <<#settings-privs>>. + + +[#settings-examples] +=== Examples + +[#disable-fbr-example] +.Disable FBR + +The following example prevents the Data Service from using FBR cluster-wide: + +[source,bash] +---- +curl -s -u Administrator:password \ + -X POST http://node1.example.com:8091/internalSettings \ + -d dataServiceFileBasedRebalanceEnabled=false +---- + +.Set the Maximum Concurrent FBR Moves per Node + +The following example sets the maximum number of concurrent file-based vBucket moves per node to `8`: + + + +[source,bash] +---- +curl -s -u Administrator:password \ + -X POST http://node1.example.com:8091/internalSettings \ + -d dataServiceFileBasedRebalanceMovesPerNode=8 +---- + +=== See Also + +* For conceptual information, see xref:learn:clusters-and-availability/rebalance.adoc#file-based-rebalance[File-Based Rebalance (FBR)]. +* For UI configuration, see xref:manage:manage-settings/general-settings.adoc#rebalance-settings[Rebalance Settings]. +* For per-bucket FBR configuration, see xref:manage:manage-buckets/edit-bucket.adoc#bucket-rebalance-type[Bucket-Level Rebalance Type]. 
\ No newline at end of file diff --git a/modules/rest-api/pages/rest-bucket-create.adoc b/modules/rest-api/pages/rest-bucket-create.adoc index 8b15cfcf28..ecb7af072b 100644 --- a/modules/rest-api/pages/rest-bucket-create.adoc +++ b/modules/rest-api/pages/rest-bucket-create.adoc @@ -94,9 +94,10 @@ curl -X POST -u : -d warmupBehavior=[ background | blocking | none ] -d memoryLowWatermark= -d memoryHighWatermark= + -d dataServiceRebalanceType=[ auto | preferFileBased | preferDcp ] ---- -All parameters are described in the following subsections. +The following subsections describe these parameters. NOTE: The `threadsNumber` parameter, which sets the number of threads for the bucket, has not had any effect since version Couchbase Server 7.0.0. It's deprecated and is no longer listed in the syntax. @@ -139,6 +140,7 @@ All other parameters are optional and have a default value. ** <> ** <> ** <> +** <> + NOTE: When migrating a bucket between storage backends, you can edit only the bucket's xref:rest-api:rest-bucket-create.adoc#ramQuota[ramQuota], xref:rest-api:rest-bucket-create.adoc#evictionpolicy[evictionPolicy], and xref:rest-api:rest-bucket-create.adoc#storagebackend[storageBackend] parameters. For more information, see xref:manage:manage-buckets/migrate-bucket.adoc[]. @@ -600,6 +602,60 @@ curl -v -X POST http://localhost:8091/pools/default/buckets/testBucket / Success returns `200 OK`, and changes the `rank` of `testBucket` to `200`. +[#dataservicerebalancetype] +=== dataServiceRebalanceType +[.edition]#{enterprise}# + +Controls the rebalance method used for this bucket's vBucket moves during Data Service rebalance. +This setting overrides the cluster-level `dataServiceFileBasedRebalanceEnabled` setting for the bucket you set it on. + +The valid values are: + +* `auto` (default): The server automatically selects File-Based Rebalance (FBR) or DCP for each vBucket move based on which is estimated to be at least 10% faster. 
+This is the recommended setting for most workloads. + +* `preferFileBased`: FBR is used for all eligible vBucket moves. +DCP is used only when required, for example during storage engine migration or eviction policy changes. + +* `preferDcp`: All vBucket moves for this bucket use DCP, regardless of the cluster-level FBR setting. +Couchbase Server never uses FBR for the bucket when you select this setting. + +You can set this parameter at bucket creation and modify it after bucket creation. + +See xref:learn:clusters-and-availability/rebalance.adoc#file-based-rebalance[File-Based Rebalance (FBR)] for more information about FBR and how the `dataServiceRebalanceType` bucket-level setting interacts with the cluster-level `dataServiceFileBasedRebalanceEnabled` setting. + +[#example-dataservicerebalancetype-create] +==== Example: Specifying a Rebalance Type During Bucket Creation + +The following creates a bucket named `testBucket` with `preferFileBased` rebalance. + +[source,bash] +---- +curl -v -X POST http://localhost:8091/pools/default/buckets \ + -u Administrator:password \ + -d name=testBucket \ + -d ramQuota=125 \ + -d dataServiceRebalanceType=preferFileBased +---- + +If the call is successful, `202 Accepted` is returned. + +[#example-dataservicerebalancetype-edit] +==== Example: Updating the Rebalance Type + +The following updates the rebalance type for `testBucket` to `auto`: + +[source,bash] +---- +curl -v -X POST http://localhost:8091/pools/default/buckets/testBucket \ + -u Administrator:password \ + -d dataServiceRebalanceType=auto +---- + +Success returns `200 OK`. + +For conceptual information on FBR and the bucket-level setting, see xref:learn:clusters-and-availability/rebalance.adoc#file-based-rebalance[File-Based Rebalance (FBR)] and xref:manage:manage-buckets/edit-bucket.adoc#bucket-rebalance-type[Bucket-Level Rebalance Type]. 
+ [#replicanumber] === replicaNumber diff --git a/modules/rest-api/pages/rest-rebalance-overview.adoc b/modules/rest-api/pages/rest-rebalance-overview.adoc index ebbe8b5c2c..ab77e5f837 100644 --- a/modules/rest-api/pages/rest-rebalance-overview.adoc +++ b/modules/rest-api/pages/rest-rebalance-overview.adoc @@ -1,5 +1,5 @@ = Rebalance -:description: pass:q[When one or more nodes have been brought into or taken out of a cluster, _rebalance_ redistributes data, indexes, event processing, and query processing among available nodes.] +:description: pass:q[When one or more nodes have been brought into or taken out of a cluster, rebalance redistributes data, indexes, event processing, and query processing among available nodes.] :page-topic-type: reference [abstract] @@ -8,7 +8,7 @@ Rebalance can be performed and configured by means of the REST API. == APIs in this Section -_Rebalance_ must be performed whenever the number of nodes in a cluster have changed, and whenever buckets have been added or removed. +Rebalance must be performed whenever the number of nodes in a cluster has changed, and whenever buckets have been added or removed. A complete overview is provided in xref:learn:clusters-and-availability/rebalance.adoc[Rebalance].
The REST API for rebalance is as follows: diff --git a/modules/rest-api/partials/rest-rebalance-table.adoc b/modules/rest-api/partials/rest-rebalance-table.adoc index 3bb6fb516e..b284ee1719 100644 --- a/modules/rest-api/partials/rest-rebalance-table.adoc +++ b/modules/rest-api/partials/rest-rebalance-table.adoc @@ -42,4 +42,12 @@ | `/internalSettings` | xref:rest-api:rest-cluster-disable-query.adoc[Disabling Consistent View Query Results on Rebalance] +| `GET` +| `/internalSettings` +| xref:rest-api:file-based-data-rebalance.adoc#get-settings[Get File-Based Rebalance (FBR) Settings] + +| `POST` +| `/internalSettings` +| xref:rest-api:file-based-data-rebalance.adoc#set-settings[Configure FBR] + |=== diff --git a/modules/rest-api/partials/user_pwd_host_port_params.adoc b/modules/rest-api/partials/user_pwd_host_port_params.adoc index 6f52d5f4eb..302ebabaac 100644 --- a/modules/rest-api/partials/user_pwd_host_port_params.adoc +++ b/modules/rest-api/partials/user_pwd_host_port_params.adoc @@ -1,14 +1,14 @@ // Be sure to set the 'required-privileges' attribute before including this partial. // It must be the anchor to the Required Privileges section for the specific endpoint. -`host`:: -Hostname or IP address of a Couchbase Server. +`HOST`:: +Hostname or IP address of a Couchbase Server node. -`port`:: +`PORT`:: Port number for the REST API. Defaults are 8091 for unencrypted and 18091 for encrypted connections. -`$USER`:: +`USER`:: The name of a user who has at least 1 of the roles listed in xref:{required-privileges}[Required Privileges]. -`$PASSWORD`:: +`PASSWORD`:: The password for the user.