| title | date | last_modified |
|---|---|---|
| export.md | 08-11-2025 | 09-11-2025:01:15 |
How to replace the Chord IP on a storage node / S3C cluster.
Pre-checks
Note
- Ring should be Green on META and DATA
- S3C should be Green and Metadata correctly synced
- Check the server name in the Federation inventory :
cd /srv/scality/s3/s3-offline/federation/
cat env/s3config/inventory
- Run a backup of the config files for all nodes
salt '*' cmd.run "scality-backup -b /var/lib/scality/backup"
- Check the Elasticsearch status (from the supervisor) :
curl -Ls 'http://localhost:4443/api/v0.1/es_proxy/_cluster/health?pretty'
- Check the status of Metadata S3C :
cd /srv/scality/s3/s3-offline/federation/
./ansible-playbook -i env/s3config/inventory tooling-playbooks/gather-metadata-status.yml
- If you have SOFS connectors, check the ZooKeeper status
- Set variables :
OLDIP="X.X.X.X"
NEWIP="X.X.X.X"
RING=DATA
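Every later step substitutes OLDIP and NEWIP into commands and config files, so a quick sanity check before going further can help. The following is a minimal sketch, not part of the Scality tooling: `is_ipv4` and `check_ip_change` are hypothetical helper names, and the check only covers dotted-quad IPv4 addresses.

```bash
# Return 0 if $1 looks like a dotted-quad IPv4 address, 1 otherwise.
# The body runs in a subshell so IFS and the variables do not leak.
is_ipv4() (
    ip=$1
    # Reject empty strings, non-digit characters and empty octets.
    case $ip in
        ''|*[!0-9.]*|*..*|.*|*.) return 1 ;;
    esac
    IFS=.
    set -- $ip
    [ $# -eq 4 ] || return 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
)

# Refuse to continue when either address is malformed or both are equal.
check_ip_change() (
    is_ipv4 "$1" || { echo "invalid old IP: $1" >&2; return 1; }
    is_ipv4 "$2" || { echo "invalid new IP: $2" >&2; return 1; }
    [ "$1" != "$2" ] || { echo "old and new IP are identical" >&2; return 1; }
)
```

Running `check_ip_change "$OLDIP" "$NEWIP"` right after setting the variables would catch a placeholder such as X.X.X.X left in place.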
Stop the Ring internal jobs :
- From the supervisor, disable auto purge, auto join and auto rebuild :
for RING in $(ringsh supervisor ringList); do \
ringsh supervisor ringConfigSet ${RING} join_auto 0; \
ringsh supervisor ringConfigSet ${RING} rebuild_auto 0; \
ringsh supervisor ringConfigSet ${RING} chordpurge_enable 0; \
done
Make the node leave the Ring from the UI, or with this loop. The loop only prints the nodeLeave commands; review them, then run them :
SERVER=myservername  # adapt with the correct name
for NODE in \
$(for RING in $(ringsh supervisor ringList); do \
ringsh supervisor ringStatus ${RING} | \
grep 'Node: ' | \
grep -w ${SERVER} | \
cut -d ' ' -f 3 ;\
done); \
do \
echo ringsh supervisor nodeLeave ${NODE/:/ } ;\
done
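The `${NODE/:/ }` expansion in the loop above replaces the first colon of each value with a space. The sketch below shows the same split with portable parameter expansion, assuming the third field of the ringStatus line has the form address:port (an assumption to verify against the real output). `split_node` is a hypothetical helper and the sample value is made up.

```bash
# Split "address:port" on the first colon and print "address port".
# For a value that contains a colon this gives the same result as the
# bash expansion ${NODE/:/ } used in the loop.
split_node() {
    printf '%s %s\n' "${1%%:*}" "${1#*:}"
}

# Sample value for illustration only.
split_node "10.0.0.1:8084"
```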
Stop the Storage node services :
Note: from the storage node
- Identify the roles of the server :
salt-call grains.get roles
Stop all the services
systemctl disable --now scality-node scality-sagentd scality-srebuildd scality-sophiactl elasticsearch.service
Stop S3C :
systemctl stop s3c@*
crictl ps -a
systemctl disable containerd.service
If the node is also ROLE_PROM / ROLE_ELASTIC / ROLE_ZK :
systemctl stop prometheus
NOW CHANGE THE IP ON THE NODE :
Change the IP address in the supervisor config files :
Note: from the supervisor
- Check the ssh connection manually and restart salt-minion
systemctl restart salt-minion
Remove the old Salt minion key and accept the new one :
salt-key -d $SERVER
salt-key -L
salt-key -A
- Update the plateform_description.csv with the new IP
- Regenerate the pillar
- Replace the IP in /etc/salt/roster
Replace every instance of the OLDIP with the NEWIP in Salt Pillar config files:
#/srv/scality/bin/bootstrap -d /root/scality/myplatform.csv --only-pillar -t $SERVER
vim /srv/scality/pillar/scality-common.sls
vim /srv/scality/pillar/{{server}}.sls
salt '*' saltutil.refresh_pillar
salt '*' saltutil.sync_all refresh=True
- Check that the old IP no longer appears (-F makes grep treat the dots literally) :
grep -F "$OLDIP" /srv/scality/pillar/*
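The same check can be made recursive and scriptable. This is a minimal sketch under the assumption that a fixed-string, whole-word match is what is wanted; `old_ip_absent` is a hypothetical helper, not a Scality or Salt command.

```bash
# List every file under directory $1 that still mentions IP $2.
# -F matches the dots literally, -w keeps 10.0.0.1 from matching
# inside 10.0.0.10, -l prints file names only.
# Returns 0 when the IP is absent, 1 when it is still present,
# 2 when grep itself failed (for example a missing directory).
old_ip_absent() {
    rc=0
    grep -rlFw -- "$2" "$1" || rc=$?
    case $rc in
        0) return 1 ;;
        1) return 0 ;;
        *) return 2 ;;
    esac
}
```

For example, `old_ip_absent /srv/scality/pillar "$OLDIP" || echo "old IP still referenced"` prints the offending files followed by the warning.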
RING : Change the IP in the scality-node config.
Note: from the storage node
Storage node :
- Check the config file :
cat /etc/scality/node/nodes.conf
Then change the IP. Run each command as a dry run first (-d), then for real :
/usr/bin/scality-update-chord-ip -n $NEWIP -d
/usr/bin/scality-update-chord-ip -n $NEWIP
/usr/bin/scality-update-node-ip -n $NEWIP -d
/usr/bin/scality-update-node-ip -n $NEWIP
- Check the config file after the IP change :
cat /etc/scality/node/nodes.conf
Srebuildd :
Note: from the supervisor
# Target all the storage nodes.
salt -G 'roles:ROLE_STORE' state.sls scality.srebuildd.configured
Check with a grep :
salt -G 'roles:ROLE_STORE' cmd.run "grep $OLDIP /etc/scality/srebuildd.conf"
salt -G 'roles:ROLE_STORE' cmd.run "grep $NEWIP /etc/scality/srebuildd.conf"
If the old IP is still there after the Salt state, run a sed replace to get rid of it :
salt -G 'roles:ROLE_STORE' cmd.run "sed -i.bak-\$(date +%Y-%m-%d) 's/${OLDIP}/${NEWIP}/g' /etc/scality/srebuildd.conf"
Check :
salt -G 'roles:ROLE_STORE' cmd.run "grep $OLDIP /etc/scality/srebuildd.conf"
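A plain `s/OLD/NEW/` treats the dots of the IP as "any character" and would also rewrite a longer address that merely starts with the old one. The sketch below is one way to guard against both, assuming GNU sed (for `\b` and `-i` with a suffix); `replace_ip` is a hypothetical helper meant to be run locally on a node.

```bash
# Replace every whole occurrence of IP $1 with IP $2 in file $3,
# keeping a dated backup next to the file (GNU sed assumed).
replace_ip() (
    # Escape the dots so that sed matches them literally.
    old_re=$(printf '%s' "$1" | sed 's/\./\\./g')
    # \b anchors the match on word boundaries: 10.0.0.1 is replaced,
    # 10.0.0.10 is left alone.
    sed -i.bak-"$(date +%Y-%m-%d)" "s/\b${old_re}\b/$2/g" "$3"
)
```

Usage would look like `replace_ip "$OLDIP" "$NEWIP" /etc/scality/srebuildd.conf`, followed by the grep check above.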
Restart srebuildd
salt -G 'roles:ROLE_STORE' service.restart scality-srebuildd
Elasticsearch :
Redeploy the Elasticsearch topology if the node has ROLE_ELASTIC :
salt -G 'roles:ROLE_ELASTIC' state.sls scality.elasticsearch.advertised
salt -G 'roles:ROLE_ELASTIC' state.sls scality.elasticsearch
Sagentd :
Note: from the storage node
salt-call state.sls scality.sagentd.registered
- Check with
cat /etc/scality/sagentd.yaml
ringsh-conf check
ringsh show conf appears to use store1 to talk to the Ring, so the IP there probably has to be changed as well :
ringsh show conf
ringsh supervisor serverList
Restart Scality services.
systemctl enable --now scality-node scality-sagentd scality-srebuildd
The server should now appear in the supervisor UI with the new IP.
If not, change the IP of the storage node as explained below :
Note: probably deprecated, not to be done.
From the supervisor GUI (http:///gui), go to the server page and delete the server, which should be red. From the same page, add a new server and enter the name and the new IP. From the terminal, check that the new server appears and is online.
At this point the storage node should be back in the Ring with the new IP.
A somewhat brute-force check on the other servers :
# salt '*' cmd.run "grep -rw $OLDIP /etc/"
Restart the Scality processes
systemctl enable --now scality-node scality-sagentd scality-srebuildd elasticsearch.service
for RING in $(ringsh supervisor ringList); do echo " #### $RING ####"; ringsh supervisor ringStorage $RING; ringsh supervisor ringStatus $RING; done
ringsh supervisor nodeJoinAll DATA
for RING in $(ringsh supervisor ringList); do \
ringsh supervisor ringConfigSet ${RING} join_auto 2; \
ringsh supervisor ringConfigSet ${RING} rebuild_auto 1; \
ringsh supervisor ringConfigSet ${RING} chordpurge_enable 1; \
done
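The disable loop at the start of the procedure and the restore loop above differ only in the values they set. The sketch below prints both sets of commands from one place so that they can be reviewed before being run. `ring_jobs_cmds` is a hypothetical helper; the "on" values (2, 1, 1) are simply the ones used in this procedure and may not match what a given platform had configured beforehand.

```bash
# Print the ringConfigSet commands that disable ("off") or restore ("on")
# the automatic jobs of ring $1. Nothing is executed.
ring_jobs_cmds() (
    case $2 in
        off) join=0; rebuild=0; purge=0 ;;
        on)  join=2; rebuild=1; purge=1 ;;  # values taken from this procedure
        *)   echo "usage: ring_jobs_cmds RING on|off" >&2; return 1 ;;
    esac
    printf 'ringsh supervisor ringConfigSet %s join_auto %s\n' "$1" "$join"
    printf 'ringsh supervisor ringConfigSet %s rebuild_auto %s\n' "$1" "$rebuild"
    printf 'ringsh supervisor ringConfigSet %s chordpurge_enable %s\n' "$1" "$purge"
)
```

For instance, `ring_jobs_cmds DATA on` prints the three restore commands for the DATA ring.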
Update the SUPAPI DB
Check the UI; if it is not correct :
grep -A3 SUP_DB /etc/scality/supapi.yaml |grep password |awk '{print $2}'
psql -U supapi
\dt
table server;
table server_ip;
-- Example values: adapt the IP and the ids to the rows listed by the two table commands above.
UPDATE server SET management_ip = '10.98.0.8' WHERE id = 19;
UPDATE server_ip SET address = '10.98.0.8' WHERE id = 17;
Elasticsearch status :
curl -Ls 'http://127.0.0.1:4443/api/v0.1/es_proxy/_cluster/health?pretty'
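The cluster health answer is a JSON document whose "status" field is green, yellow or red. The sketch below extracts that field without extra tooling, assuming the supervisor proxy returns the standard Elasticsearch health document unchanged. `es_status` and `es_is_green` are hypothetical helpers.

```bash
# Read a _cluster/health JSON document on stdin and print its status.
es_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Succeed only when the document read on stdin reports a green cluster.
es_is_green() {
    [ "$(es_status)" = "green" ]
}

# Intended use (not executed here):
#   curl -Ls 'http://127.0.0.1:4443/api/v0.1/es_proxy/_cluster/health?pretty' | es_is_green
```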
S3C : Change topology
- Edit the inventory with the new IP :
cd /srv/scality/s3/s3-offline/federation
vim env/s3config/inventory
- Replace the IP in group_vars/all :
vim env/s3config/group_vars/all
The other servers must be told about the IP change first.
For example, if the IP is being changed on md1-cluster1, first redeploy the other servers with the new topology :
cd /srv/scality/s3/s3-offline/federation
./ansible-playbook -i env/s3config/inventory run.yml -t s3,DR -l md2-cluster1 --skip-tags "requirements,run::images,cleanup" -e "redis_ip_check=False"
./ansible-playbook -i env/s3config/inventory run.yml -t s3,DR -l md3-cluster1 --skip-tags "requirements,run::images,cleanup" -e "redis_ip_check=False"
./ansible-playbook -i env/s3config/inventory run.yml -t s3,DR -l md4-cluster1 --skip-tags "requirements,run::images,cleanup" -e "redis_ip_check=False"
./ansible-playbook -i env/s3config/inventory run.yml -t s3,DR -l md5-cluster1 --skip-tags "requirements,run::images,cleanup" -e "redis_ip_check=False"
./ansible-playbook -i env/s3config/inventory run.yml -t s3,DR -l stateless2 --skip-tags "requirements,run::images,cleanup" -e "redis_ip_check=False"
./ansible-playbook -i env/s3config/inventory run.yml -t s3,DR -l stateless1 --skip-tags "requirements,run::images,cleanup" -e "redis_ip_check=False"
Note : it is not certain that the tags -t s3,DR will work, due to a bug in this S3C version.
If they do not, run run.yml without -t.
Once all the other servers are redeployed, redeploy S3 on the server whose IP changed (md1-cluster1) :
./ansible-playbook -i env/s3config/inventory run.yml -l md1-cluster1 --skip-tags "cleanup,run::images" -e "redis_ip_check=False"
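The order above (every other host first, the changed host last) can be generated instead of typed by hand. This is a minimal sketch that only prints the commands already shown in this section; `redeploy_cmds` is a hypothetical helper, and the host names must come from the real inventory.

```bash
# Usage: redeploy_cmds TARGET HOST...
# Prints one run.yml invocation per HOST other than TARGET, then the
# final invocation for TARGET. Nothing is executed.
redeploy_cmds() (
    target=$1
    shift
    for host in "$@"; do
        [ "$host" = "$target" ] && continue
        printf './ansible-playbook -i env/s3config/inventory run.yml -t s3,DR -l %s --skip-tags "requirements,run::images,cleanup" -e "redis_ip_check=False"\n' "$host"
    done
    printf './ansible-playbook -i env/s3config/inventory run.yml -l %s --skip-tags "cleanup,run::images" -e "redis_ip_check=False"\n' "$target"
)
```

Calling `redeploy_cmds md1-cluster1 md1-cluster1 md2-cluster1 md3-cluster1` prints two commands for md2 and md3, then the final one for md1-cluster1.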
Redis on S3C :
Redis on S3C does not handle IP address changes well, so check its status.
Check the Redis cluster : all nodes should report the same master IP.
../repo/venv/bin/ansible -i env/s3config/inventory -m shell -a 'ctrctl exec redis-server redis-cli -p 16379 sentinel get-master-addr-by-name scality-s3' md[12345]-cluster1
../repo/venv/bin/ansible -i env/s3config/inventory -m shell -a 'ctrctl exec redis-server redis-cli info replication | grep -E "master_host|role"' md[12345]-cluster1
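Comparing the reported masters by eye is easy to get wrong on five nodes. The sketch below assumes the master addresses have already been extracted from the command output, one per line (the exact ansible output layout is not assumed here); `same_master` is a hypothetical helper.

```bash
# Read one reported master address per line on stdin and succeed only
# when all non-empty lines are identical.
same_master() {
    count=$(grep -v '^[[:space:]]*$' | sort -u | wc -l | tr -d ' ')
    [ "$count" -eq 1 ]
}
```

For example, `printf '%s\n' 10.0.0.1 10.0.0.1 10.0.0.2 | same_master` fails because two different masters are reported.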