personotes/notes/meetings/export.md (8.6 KiB)

Date: 08-11-2025 | Last modified: 09-11-2025 01:15

How to replace the Chord IP on a Storage node / S3C cluster.

Pre-checks

Note

  • Ring should be Green on META and DATA
  • S3C should be Green and Metadata correctly synced
  • Check the server name in the Federation inventory:
cd /srv/scality/s3/s3-offline/federation/
cat env/s3config/inventory
  • Run a backup of the config files for all nodes
salt '*' cmd.run "scality-backup -b /var/lib/scality/backup"
  • Check ElasticSearch Status (from the supervisor)
curl -Ls http://localhost:4443/api/v0.1/es_proxy/_cluster/health?pretty
  • Check the status of the S3C Metadata:
cd /srv/scality/s3/s3-offline/federation/
./ansible-playbook -i env/s3config/inventory tooling-playbooks/gather-metadata-status.yml
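The ElasticSearch health check above returns JSON; a quick way to assert that the status is green (assuming `python3` is available on the supervisor), shown here against a canned response rather than the live endpoint:

```shell
# Canned _cluster/health response; in production, feed the output of the
# es_proxy curl from the pre-check list into the same pipeline.
health='{"cluster_name":"scality","status":"green","number_of_nodes":3}'
status=$(printf '%s' "$health" | python3 -c 'import sys,json; print(json.load(sys.stdin)["status"])')
[ "$status" = "green" ] && echo "ES OK" || echo "ES NOT GREEN: $status"
```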

If you have SOFS Connectors, also check the ZooKeeper status.
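One common way to check ZooKeeper health is the four-letter-word commands (this assumes `stat` is in the 4lw whitelist and ZooKeeper listens on the default port 2181); the grep is shown here against canned output:

```shell
# In production: echo stat | nc <zk-host> 2181 | grep '^Mode'
sample='Zookeeper version: 3.4.14
Mode: follower
Node count: 512'
printf '%s\n' "$sample" | grep '^Mode'
# One node per ensemble should report "Mode: leader", the rest "Mode: follower".
```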

  • Set the variables:
OLDIP="X.X.X.X"
NEWIP="X.X.X.X"
RING=DATA
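Since everything below substitutes these variables into sed and salt commands, a hypothetical sanity check (not in the original procedure) is to verify they at least look like IPv4 addresses before going further:

```shell
# Example values; refuse to continue if OLDIP/NEWIP are malformed.
OLDIP="10.98.0.8"; NEWIP="10.98.0.9"
for ip in "$OLDIP" "$NEWIP"; do
  echo "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' \
    && echo "$ip looks valid" \
    || echo "$ip is not an IPv4 address, fix it before continuing"
done
```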

Stop the Ring internal jobs:

  • From the supervisor, disable auto join, auto rebuild, and auto purge:
for RING in $(ringsh supervisor ringList); do \
   ringsh supervisor ringConfigSet ${RING} join_auto 0; \
   ringsh supervisor ringConfigSet ${RING} rebuild_auto 0; \
   ringsh supervisor ringConfigSet ${RING} chordpurge_enable 0; \
done
  • Leave the node from the UI or with this loop:
SERVER=myservername   # adapt with the correct name

for NODE in \
  $(for RING in $(ringsh supervisor ringList); do \
     ringsh supervisor ringStatus ${RING} | \
     grep 'Node: ' | \
     grep -w ${SERVER} | \
     cut -d ' ' -f 3 ;\
     done); \
  do \
  echo  ringsh supervisor nodeLeave ${NODE/:/ } ;\
done
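The loop above only echoes the commands (a dry run); drop the `echo` to actually make the nodes leave. The `${NODE/:/ }` expansion is what turns the `ip:port` pair printed by ringStatus into the two separate arguments nodeLeave expects (bash substitution, illustrated with an example value):

```shell
NODE="10.98.0.8:4249"   # example ip:port as reported by ringStatus
echo ringsh supervisor nodeLeave ${NODE/:/ }
# prints: ringsh supervisor nodeLeave 10.98.0.8 4249
```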

Stop the Storage node services:

Note: From the storage node.

  • Identify the roles of the server:
salt-call grains.get roles

Stop all the services:

systemctl disable --now scality-node scality-sagentd scality-srebuildd scality-sophiactl elasticsearch.service

Stop S3C:

systemctl stop 's3c@*'
crictl ps -a
systemctl disable containerd.service

If the node also has ROLE_PROM / ROLE_ELASTIC / ROLE_ZK:

systemctl stop prometheus

NOW CHANGE THE IP ON THE NODE:

Change the IP address in the supervisor config files:

Note: From the supervisor.

  • Check the SSH connection manually and restart salt-minion:
systemctl restart salt-minion
Remove the old salt minion key and accept the new one:

salt-key -d $SERVER
salt-key -L
salt-key -A
  • Update the platform description CSV with the new IP
  • Regenerate the pillar
  • Replace the IP in /etc/salt/roster

Replace every instance of the OLDIP with the NEWIP in Salt Pillar config files:

#/srv/scality/bin/bootstrap -d /root/scality/myplatform.csv --only-pillar -t $SERVER
vim /srv/scality/pillar/scality-common.sls
vim /srv/scality/pillar/{{server}}.sls

salt '*' saltutil.refresh_pillar
salt '*' saltutil.sync_all refresh=True
  • Check
grep $OLDIP /srv/scality/pillar/*
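Instead of editing the pillar files in vim, the replacement can be scripted; a sketch, assuming OLDIP/NEWIP are set as above and demonstrated on a scratch file (point the sed at /srv/scality/pillar/*.sls on the real supervisor; the .bak files keep a rollback copy):

```shell
OLDIP="10.98.0.8"; NEWIP="10.98.0.9"    # example values
tmp=$(mktemp)
printf 'supervisor_ip: %s\n' "$OLDIP" > "$tmp"   # stand-in for a .sls file
sed -i.bak "s/${OLDIP}/${NEWIP}/g" "$tmp"
grep -c "$NEWIP" "$tmp"    # 1 when the replacement happened
rm -f "$tmp" "$tmp.bak"
```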

RING: Change the IP in the scality-node config.

Note: From the storage node.

Storage node:

  • Check the config file:

cat /etc/scality/node/nodes.conf

Then change the IP.

Run a dry run first with -d, then apply:

/usr/bin/scality-update-chord-ip -n $NEWIP -d 
/usr/bin/scality-update-chord-ip -n $NEWIP

/usr/bin/scality-update-node-ip -n $NEWIP -d
/usr/bin/scality-update-node-ip -n $NEWIP
  • Check the config file after the IP change:

cat /etc/scality/node/nodes.conf
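Beyond eyeballing the file, the post-change check can be made into an assertion that the old IP is gone; shown on a stand-in file (run the grep against /etc/scality/node/nodes.conf for real):

```shell
OLDIP="10.98.0.8"; NEWIP="10.98.0.9"    # example values
conf=$(mktemp)
printf 'name = node1\nip = %s\n' "$NEWIP" > "$conf"   # stand-in for nodes.conf
if grep -q "$OLDIP" "$conf"; then
  echo "old IP still present, fix before restarting services"
else
  echo "nodes.conf clean"
fi
rm -f "$conf"
```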

Srebuildd:

Note: From the supervisor.

# Target all the storage nodes.
salt -G 'roles:ROLE_STORE' state.sls scality.srebuildd.configured

Check with a grep:

salt -G 'roles:ROLE_STORE' cmd.run "grep $OLDIP /etc/scality/srebuildd.conf"
salt -G 'roles:ROLE_STORE' cmd.run "grep $NEWIP /etc/scality/srebuildd.conf"

If the old IP is still there after the salt state, run a sed replace to get rid of it:

salt -G 'roles:ROLE_STORE' cmd.run "sed -i.bak-$(date +%Y-%m-%d) 's/${OLDIP}/${NEWIP}/g' /etc/scality/srebuildd.conf"

Check again:

salt -G 'roles:ROLE_STORE' cmd.run "grep $OLDIP /etc/scality/srebuildd.conf"

Restart srebuildd:

salt -G 'roles:ROLE_STORE' service.restart scality-srebuildd

ElasticSearch:

Redeploy the Elastic topology if the node had ROLE_ELASTIC:

salt -G 'roles:ROLE_ELASTIC' state.sls scality.elasticsearch.advertised
salt -G 'roles:ROLE_ELASTIC' state.sls scality.elasticsearch

Sagentd:

Note: From the storage node.

salt-call state.sls scality.sagentd.registered
  • Check with: cat /etc/scality/sagentd.yaml

ringsh-conf check

ringsh show conf may still reference the old IP (it appears to use store1 to talk to the Ring); check it and update if needed:

ringsh show conf
ringsh supervisor serverList

Restart the Scality services:

systemctl enable --now scality-node scality-sagentd scality-srebuildd

The server should now appear in the supervisor UI with the new IP.

If not, change it from the supervisor GUI as explained below:

Note: Probably deprecated, not to be done.

From the supervisor GUI (http:///gui), go to the Servers page and delete the server, which should be red. From the same page, add a new server and enter the name and the new IP. From the terminal, check that the new server appears and is online.

At this point the storage node is supposed to be back in the Ring with the NEW IP.

A somewhat brute-force check on the other servers:

# salt '*' cmd.run "grep -rw $OLDIP /etc/"

Restart the Scality processes:

systemctl enable --now scality-node scality-sagentd scality-srebuildd elasticsearch.service

for RING in $(ringsh supervisor ringList); do echo " #### $RING ####"; ringsh supervisor ringStorage $RING; ringsh supervisor ringStatus $RING; done

ringsh supervisor nodeJoinAll DATA

for RING in $(ringsh supervisor ringList); do \
  ringsh supervisor ringConfigSet ${RING} join_auto 2; \
  ringsh supervisor ringConfigSet ${RING} rebuild_auto 1; \
  ringsh supervisor ringConfigSet ${RING} chordpurge_enable 1; \
done

Update the SUPAPI DB

Check the UI; if it still shows the old IP:

grep -A3 SUP_DB /etc/scality/supapi.yaml |grep password |awk '{print $2}'
psql -U supapi
\dt

table server;
table server_ip;

UPDATE server SET management_ip = '10.98.0.8' WHERE id = 19;
UPDATE server_ip SET address = '10.98.0.8' WHERE id = 17;
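The WHERE ids above (19 and 17) are environment-specific. A hedged alternative, using the same two columns shown by the `table` queries, is to key the updates on the old IP itself and generate the statements from the shell variables:

```shell
OLDIP="10.98.0.8"; NEWIP="10.98.0.9"    # example values
# Generate the UPDATE statements; pipe the output into `psql -U supapi`
# on the supervisor after checking it by eye.
printf "UPDATE server SET management_ip = '%s' WHERE management_ip = '%s';\n" "$NEWIP" "$OLDIP"
printf "UPDATE server_ip SET address = '%s' WHERE address = '%s';\n" "$NEWIP" "$OLDIP"
```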

ElasticSearch status:

curl -Ls http://127.0.0.1:4443/api/v0.1/es_proxy/_cluster/health?pretty

S3C: Change the topology

  • Edit the inventory with the new IP:
cd /srv/scality/s3/s3-offline/federation
vim env/s3config/inventory
  • Replace the IP in group_vars/all:
vim env/s3config/group_vars/all

We first have to advertise the IP change to the OTHER SERVERS.

Example: we are changing the IP on md1-cluster1.

We redeploy the other servers with the new topology:

cd /srv/scality/s3/s3-offline/federation
./ansible-playbook -i env/s3config/inventory run.yml -t s3,DR -l md2-cluster1 --skip-tags "requirements,run::images,cleanup" -e "redis_ip_check=False"
./ansible-playbook -i env/s3config/inventory run.yml -t s3,DR -l md3-cluster1 --skip-tags "requirements,run::images,cleanup" -e "redis_ip_check=False"
./ansible-playbook -i env/s3config/inventory run.yml -t s3,DR -l md4-cluster1 --skip-tags "requirements,run::images,cleanup" -e "redis_ip_check=False"
./ansible-playbook -i env/s3config/inventory run.yml -t s3,DR -l md5-cluster1 --skip-tags "requirements,run::images,cleanup" -e "redis_ip_check=False"
./ansible-playbook -i env/s3config/inventory run.yml -t s3,DR -l stateless2 --skip-tags "requirements,run::images,cleanup" -e "redis_ip_check=False"
./ansible-playbook -i env/s3config/inventory run.yml -t s3,DR -l stateless1 --skip-tags "requirements,run::images,cleanup" -e "redis_ip_check=False"
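Since the six invocations only differ by the -l host limit, they can be collapsed into a loop; the echo keeps this a dry run (drop it to actually redeploy):

```shell
for H in md2-cluster1 md3-cluster1 md4-cluster1 md5-cluster1 stateless2 stateless1; do
  echo ./ansible-playbook -i env/s3config/inventory run.yml -t s3,DR -l "$H" \
    --skip-tags "requirements,run::images,cleanup" -e "redis_ip_check=False"
done
```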

Note: the -t s3,DR tag may not work due to a bug in some S3C versions. If it does not, run run.yml without -t.

Then, when all the other servers are redeployed, redeploy S3 on the current server (md1-cluster1):

./ansible-playbook -i env/s3config/inventory run.yml -l md1-cluster1 --skip-tags "cleanup,run::images" -e "redis_ip_check=False"

Redis on S3C:

Redis on S3C does not handle IP address changes well; check its status.

Check the Redis cluster: all members are supposed to report the same master IP:

../repo/venv/bin/ansible -i env/s3config/inventory -m shell -a 'ctrctl exec redis-server redis-cli -p 16379 sentinel get-master-addr-by-name scality-s3' md[12345]-cluster1
../repo/venv/bin/ansible -i env/s3config/inventory -m shell -a 'ctrctl exec redis-server redis-cli info replication | grep -E "master_host|role"' md[12345]-cluster1
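For reference, here is the grep from the second command applied to a canned `redis-cli info replication` output; these are the lines that must agree across the five md hosts (exactly one `role:master`, and the same master_host everywhere else):

```shell
sample='role:slave
master_host:10.98.0.9
master_link_status:up'
printf '%s\n' "$sample" | grep -E 'master_host|role'
# keeps: role:slave
#        master_host:10.98.0.9
```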