Metadata-Version: 2.4
Name: couchdb-cluster-admin
Version: 0.7.4.dev20250930141220
Summary: Utility for managing multi-node couchdb clusters
Author-email: Dimagi <dev@dimagi.com>
License: BSD 3-Clause License
        
        Copyright (c) 2017, Dimagi
        All rights reserved.
        
        Redistribution and use in source and binary forms, with or without
        modification, are permitted provided that the following conditions are met:
        
        * Redistributions of source code must retain the above copyright notice, this
          list of conditions and the following disclaimer.
        
        * Redistributions in binary form must reproduce the above copyright notice,
          this list of conditions and the following disclaimer in the documentation
          and/or other materials provided with the distribution.
        
        * Neither the name of the copyright holder nor the names of its
          contributors may be used to endorse or promote products derived from
          this software without specific prior written permission.
        
        THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
        AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
        IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
        DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
        FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
        DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
        SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
        CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
        OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
        OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
        
Project-URL: Home, https://github.com/dimagi/couchdb-cluster-admin
Classifier: Development Status :: 3 - Alpha
Classifier: Environment :: Web Environment
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: argparse>=1.4
Requires-Dist: dimagi-memoized
Requires-Dist: gevent
Requires-Dist: jsonobject
Requires-Dist: PyYAML
Requires-Dist: requests
Dynamic: license-file

# couchdb-cluster-admin
A utility for managing multi-node CouchDB 2.x clusters.

# First, put together a config file for your setup

This will make the rest of the commands simpler to run. Copy the example

```
cp config/conf.example.yml config/mycluster.yml
```

and then edit it with the details of your cluster.

# Setting up a local cluster to test on

If you have docker installed you can just run

```bash
docker build -t couchdb-cluster - < docker-couchdb-cluster/Dockerfile
```

to build the cluster image (based on klaemo/couchdb:2.0-dev) and then run

```bash
docker run --name couchdb-cluster \
  -p 15984:15984 \
  -p 15986:15986 \
  -p 25984:25984 \
  -p 25986:25986 \
  -p 35984:35984 \
  -p 35986:35986 \
  -p 45984:45984 \
  -p 45986:45986 \
  -v $(pwd)/data:/usr/src/couchdb/dev/lib/ \
  -t couchdb-cluster \
  --with-admin-party-please \
  -n 4
```

to start a cluster with 4 nodes. The nodes' data will be persisted to `./data`.

To run the tests (which require this docker setup), download and install https://github.com/sstephenson/bats

```bash
git clone https://github.com/sstephenson/bats.git
cd bats
./install.sh /usr/local  # or wherever on your PATH you want to install this
```

and then

```bash
docker start couchdb-cluster  # make sure this is running and localhost:15984 is receiving pings
bats test/
```
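
If you would rather check every node than only `localhost:15984`, a small script can do it. The following is a minimal sketch, not part of this project: `node_urls` and `is_up` are hypothetical helper names, and the port scheme (node k on port k*10000 + 5984) is an assumption read off the `docker run` port mappings above.

```python
import urllib.error
import urllib.request


def node_urls(node_count, host="localhost"):
    """Build one URL per node, assuming node k listens on port k*10000 + 5984."""
    return [f"http://{host}:{k * 10000 + 5984}/" for k in range(1, node_count + 1)]


def is_up(url, timeout=2.0):
    """Return True if an HTTP GET to `url` answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

With the four-node container running, `[is_up(url) for url in node_urls(4)]` should give one result per node; any `False` suggests that node is not yet answering.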

# Optional: Set password in environment

If you do not wish to specify your password every time you run a command,
you may put its value in the `COUCHDB_CLUSTER_ADMIN_PASSWORD` environment variable like so:

```
read -sp Password: PW
```

Then, for all commands below, prefix the command with `COUCHDB_CLUSTER_ADMIN_PASSWORD=$PW`, e.g.

```
COUCHDB_CLUSTER_ADMIN_PASSWORD=$PW python couchdb_cluster_admin/describe.py --conf config/mycluster.yml
```
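
If you wrap these commands in your own scripts, the same "environment first, prompt otherwise" behavior is easy to reproduce. This is a minimal sketch of that pattern, not this project's actual code; `get_password` is a hypothetical helper, and only the `COUCHDB_CLUSTER_ADMIN_PASSWORD` variable name comes from this README.

```python
import getpass
import os


def get_password(environ=None, prompt=getpass.getpass):
    """Return the password from the environment, prompting only if it is unset."""
    environ = os.environ if environ is None else environ
    password = environ.get("COUCHDB_CLUSTER_ADMIN_PASSWORD")
    if password is None:
        # Fall back to an interactive prompt that does not echo the input.
        password = prompt("Password: ")
    return password
```

Passing `environ` and `prompt` as parameters keeps the helper easy to test without touching the real environment or a terminal.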

# Get a quick overview of your cluster

Now you can run

```
python couchdb_cluster_admin/describe.py --conf config/mycluster.yml
```

to see an overview of your cluster nodes and shard allocation.
For example, in the following output:

```
Membership
	cluster_nodes:	couch3	couch1	couch4	couch2
	all_nodes:	couch3	couch1	couch4	couch2
Shards
	                   00000000-1fffffff  20000000-3fffffff  40000000-5fffffff  60000000-7fffffff  80000000-9fffffff  a0000000-bfffffff  c0000000-dfffffff  e0000000-ffffffff
	mydb                    couch1             couch1             couch1             couch1             couch1             couch1             couch1             couch1
	my_second_database      couch1             couch1             couch1             couch1             couch1             couch1             couch1             couch1
```

you can see that while there are four nodes,
all shards are currently assigned only to the first node.

# Help estimating shard allocation

In order to plan out a shard reallocation, you can run the following command:

```bash
python couchdb_cluster_admin/suggest_shard_allocation.py --conf config/mycluster.yml --allocate couch1:1 couch2,couch3,couch4:2
```

The values for the `--allocate` arg in the example above should be interpreted as
"Put 1 copy on couch1, and put 2 copies spread across couch2, couch3, and couch4".

The output looks like this:

```
couch1	57.57 GB
couch2	42.15 GB
couch3	36.5 GB
couch4	36.5 GB
                     00000000-1fffffff     20000000-3fffffff     40000000-5fffffff     60000000-7fffffff     80000000-9fffffff     a0000000-bfffffff     c0000000-dfffffff     e0000000-ffffffff
mydb                couch1,couch2,couch4  couch1,couch2,couch3  couch1,couch3,couch4  couch1,couch2,couch4  couch1,couch2,couch3  couch1,couch3,couch4  couch1,couch2,couch4  couch1,couch2,couch3
my_second_database  couch1,couch3,couch4  couch1,couch3,couch4  couch1,couch3,couch4  couch1,couch3,couch4  couch1,couch3,couch4  couch1,couch3,couch4  couch1,couch3,couch4  couch1,couch3,couch4
```
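
To make the `--allocate` syntax concrete, here is a minimal sketch of how such values could be parsed into (nodes, copies) groups. It is an illustration only, not the parser `suggest_shard_allocation.py` uses: `parse_allocation` is a hypothetical name, and rejecting groups that ask for more copies than they have nodes is an assumption.

```python
def parse_allocation(specs):
    """Parse values like ["couch1:1", "couch2,couch3,couch4:2"].

    Returns a list of (nodes, copies) tuples, one per group.
    """
    groups = []
    for spec in specs:
        nodes_part, sep, copies_part = spec.rpartition(":")
        if not sep:
            raise ValueError(f"expected NODES:COPIES, got {spec!r}")
        nodes = [name for name in nodes_part.split(",") if name]
        copies = int(copies_part)
        # Assumption: a group cannot hold more copies of a shard than it has nodes.
        if not nodes or copies < 1 or copies > len(nodes):
            raise ValueError(f"invalid allocation group {spec!r}")
        groups.append((nodes, copies))
    return groups
```

For the example command, this yields one group with couch1 holding 1 copy and a second group with couch2, couch3, and couch4 sharing 2 copies, which matches the three nodes listed per shard in the output.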

Note that the suggested allocation does not take into account the current location of shards,
so it is much more useful when you're moving from a single-node cluster
to a multi-node cluster than when you're adding one more node to an existing multi-node cluster.
In the example above, couch1 would be the single-node cluster, and couch2, couch3, and couch4
form the multi-node cluster-to-be. You can imagine that after implementing
the shard allocation suggested here, we might remove all shards from couch1 and remove it from the cluster.

Note also that there is no guarantee that the "same" shard of different databases will go to the same node;
each (db, shard)-pair is treated as an independent unit when computing an even shard allocation.
In this example there are only a few dbs and shards; when shards * dbs is high,
this process can be quite good at evenly balancing your data across nodes.
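
The idea of treating each (db, shard)-pair independently can be sketched with a simple greedy heuristic: place the largest shards first, each on the currently least-loaded nodes. This is an illustrative stand-in, not necessarily the algorithm this tool implements; the function name `allocate` and the sizes in the usage note are made up.

```python
import heapq


def allocate(shard_sizes, nodes, copies):
    """Greedily place `copies` replicas of each (db, shard) on the least-loaded nodes.

    `shard_sizes` maps (db, shard_range) -> size in bytes.
    Returns (placement, load): placement maps each key to a sorted list of
    nodes, and load maps each node to the total bytes assigned to it.
    """
    if copies > len(nodes):
        raise ValueError("cannot place more copies than there are nodes")
    load = {node: 0 for node in nodes}
    placement = {}
    # Largest shards first, so the smaller ones can even out what remains.
    for key, size in sorted(shard_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        # Pick the `copies` least-loaded nodes; ties are broken by node name.
        chosen = heapq.nsmallest(copies, nodes, key=lambda n: (load[n], n))
        for node in chosen:
            load[node] += size
        placement[key] = sorted(chosen)
    return placement, load
```

Because every pair is placed on its own, nothing keeps the same shard range of two databases together, but with many pairs the per-node totals tend to converge. For instance, two 10-byte shards and two 4-byte shards with 2 copies each across three nodes end up as loads of 20, 18, and 18 bytes.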
