I’m stuck at spinning up the OSD - the container is restart-looping. I’m attempting to use a partition rather than a whole disk, and I wonder if that’s the problem. The docs indicate that *should* work, though. The logs look like this:
So I gave up on running in Docker. Using ceph-deploy instead to spin up the cluster on bare metal got me much further. I am using the newer version of Ceph (Mimic) and I’m almost fully up and running. I’m just missing the part where the docker volume plugin gets installed - it’s just links to upstream bugs in your docs. I’m not using Atomic so I hopefully don’t suffer from the linked issues.
I’m assuming you are advising to install the rexray/rbd plugin, which I’m having a go at using now…
Yes, keen to hear how that docker volume works out!
It’s not working, but I’m pretty much uninitiated on the setup for this - before installing the plugin, a Ceph RBD pool (I think?) needs to be set up. I have gone through the Ceph docs to do this, but I’m pretty sure I’m missing something (as it isn’t working).
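For reference, here’s roughly what I’ve been running - the pool name is one I made up, the PG count is a guess, and I *think* the plugin takes the pool via `RBD_DEFAULTPOOL`, but I may have that wrong:

```shell
# Create a pool for Docker volumes (pool name and PG count are my guesses)
ceph osd pool create docker-volumes 64
# Initialize it for RBD use
rbd pool init docker-volumes
# Install the RexRay RBD plugin pointed at that pool
docker plugin install rexray/rbd RBD_DEFAULTPOOL=docker-volumes
```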
This is blocking me from deploying some services using swarm replication, where each container wants its own storage rather than a Ceph-replicated filesystem bind-mounted. I can just use plain old local Docker volumes in the short term, but that rather defeats the purpose of this exercise.
OK, I got way further. My Ceph config is all good now, in terms of RBD.
The Rexray plugin is giving me problems now. It seems like when I create a volume in a compose file, it fails after the first node, because the other nodes complain the volume already exists. I might need to employ some yaml-fu to increment the volume name with a numeric suffix perhaps.
I did try a different RBD driver, but that one had its own issues.
Nearly there. I switched plugins to wetopi/rbd as it seemed to have a more robust reputation than RexRay in the DevOps groups I’m a member of.
I gave up on the volume naming issue though - the service I want to deploy (Consul) is stateful, requires persistent data, and can scale to x nodes. Volume creation collides after the first node. I have to either statically define volumes and containers (which means it can’t scale dynamically) or do something funky with copying data around prior to service init. Not sure what else to do, I ended up statically defining the volumes and throwing out the ability to scale.
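(One thing I haven’t tried yet: `docker service create` apparently supports Go templates in the mount source, which might sidestep the collision by giving each replica its own volume name - something like this, though I haven’t verified the RBD plugin plays nicely with it:

```shell
# {{.Task.Slot}} expands per replica, so each task should get its own
# volume (consul-data-1, consul-data-2, ...) instead of colliding on one name.
# Driver name assumes the wetopi/rbd plugin is already installed.
docker service create --name consul --replicas 3 \
  --mount 'type=volume,source=consul-data-{{.Task.Slot}},destination=/consul/data,volume-driver=wetopi/rbd' \
  consul
```

Not sure how well that translates back into a compose file, mind.)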
Traefik is running happily, and RBD-backed, plugin-created Docker volumes are lovely. Now that I’m out of the Ceph woods and back in familiar Dockery territory, I’m disappointed I couldn’t get Ceph running in containers. I can’t seem to find a containerized solution for running ceph-volume. Maybe I should just roll my own.
Seffyroff, I created a set of containers for Ceph (latest version of Mimic) from scratch by reading their manuals. Maybe you can try them to see if you can run Ceph over containers.
A simple docker-compose.yml example (with all daemons, but no serious storage configured) is:
```yaml
version: '3.5'
services:
  etcd0:
    image: quay.io/coreos/etcd
    environment:
      - ETCD_LISTEN_CLIENT_URLS=http://0.0.0.0:2379
      - ETCD_ADVERTISE_CLIENT_URLS=http://etcd0:2379
  mon0:
    image: flaviostutz/ceph-monitor
    environment:
      - ETCD_URL=http://etcd0:2379
      - PEER_MONITOR_HOST=mon1
      - CREATE_CLUSTER_IF_PEER_DOWN=true
  mon1:
    image: flaviostutz/ceph-monitor
    environment:
      - ETCD_URL=http://etcd0:2379
      - PEER_MONITOR_HOST=mon0
  mgr1:
    image: flaviostutz/ceph-manager
    ports:
      - 18443:8443  # dashboard https
      - 18003:8003  # restful https
      - 19283:9283  # prometheus
    environment:
      - LOG_LEVEL=0
      - PEER_MONITOR_HOST=mon0
      - ETCD_URL=http://etcd0:2379
  mgr2:
    image: flaviostutz/ceph-manager
    ports:
      - 28443:8443  # dashboard https
      - 28003:8003  # restful https
      - 29283:9283  # prometheus
    environment:
      - LOG_LEVEL=0
      - PEER_MONITOR_HOST=mon0
      - ETCD_URL=http://etcd0:2379
  osd1:
    image: flaviostutz/ceph-osd
    environment:
      - PEER_MONITOR_HOST=mon0
      - OSD_EXT4_SUPPORT=true
      - OSD_JOURNAL_SIZE=512
      - ETCD_URL=http://etcd0:2379
  osd2:
    image: flaviostutz/ceph-osd
    environment:
      - PEER_MONITOR_HOST=mon0
      - OSD_EXT4_SUPPORT=true
      - OSD_JOURNAL_SIZE=512
      - ETCD_URL=http://etcd0:2379
  osd3:
    image: flaviostutz/ceph-osd
    environment:
      - PEER_MONITOR_HOST=mon0
      - OSD_EXT4_SUPPORT=true
      - OSD_JOURNAL_SIZE=512
      - ETCD_URL=http://etcd0:2379
```
Just run “docker-compose up” and (hopefully) the magic will happen…
On https://hub.docker.com/r/flaviostutz/ceph-osd/ you can see how to configure a storage (Bluestore).
@flaviostutz thanks for your reply. I actually already built several iterations of docker-compose here, of similar structure.
The problem I had was with initializing and mounting the block storage. I had to use ceph-deploy to get that working, and at that point had spent so many cycles watching failed docker containers that I carried on using ceph-deploy after the success in creating the storage to perform the rest of the deployment directly to the hosts.
I don’t doubt I could probably spin up containers for the storage now that it exists, but I have other problems with my Ceph deployment that I want to fully understand before iterating over my rollout strategy.
So I have a ceph cluster running on 3 nodes, with each one contributing 200-500GB of block storage, which is used by a cephfs and rbd pool. The cephfs pool mostly gets bind mounted by swarm containers, and the rbd pool is used by other containers that need their own volumes per task.
My problem here is memory usage - the Ceph processes reduce the cluster to a crawl and quickly (<48h) consume all available RAM and fill up the swap.
Does anyone have recommendations to reduce Ceph memory usage? I already tried reducing the bluestore cache size to 256. I found some docs related to cephfs metadata cache reduction and will try that next, but I don’t have much hope. The nodes I’m working with are modest spec, but they have a minimum of 2GB RAM, and I avoid scheduling anything hefty on the most lightweight boxes.
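Concretely, this is what I’m planning to try next - the values are guesses for 2GB nodes, and I’m assuming Mimic’s centralized config store is available (I believe `osd_memory_target` needs 13.2.3 or later, if I’m reading the release notes right):

```shell
# Cap the memory each OSD daemon tries to keep resident (~1 GiB here).
ceph config set osd osd_memory_target 1073741824
# Shrink the MDS metadata cache for CephFS (~256 MiB here).
ceph config set mds mds_cache_memory_limit 268435456
```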