T365998 Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 -lsw1-f3-eqiad

Status	Assigned	Task
Open	cmooney	T348977 Upgrade EVPN switches Eqiad row E-F to JunOS 22.2
Open	cmooney	T365998 Upgrade EVPN switches Eqiad row E-F to JunOS 22.2 -lsw1-f3-eqiad
Resolved	Marostegui	T368374 Move one host temporarily to m2
Resolved	Marostegui	T368494 Switchover m2 master db1195 -> db1228

ABran-WMF created this task. May 27 2024, 1:22 PM

ABran-WMF updated Other Assignee, added: MatthewVernon.

ABran-WMF mentioned this in T348977: Upgrade EVPN switches Eqiad row E-F to JunOS 22.2.

ABran-WMF added projects: Data-Persistence, DBA. May 27 2024, 3:03 PM

ABran-WMF moved this task from Triage to Ready on the DBA board. May 28 2024, 7:48 AM

ABran-WMF added a project: SRE-swift-storage. May 28 2024, 10:24 AM

Swift-wise, we just need to check that the cluster is happy afterwards.

moss-be1003 is part of the apus Ceph cluster, which should be in production by the end of this quarter (i.e. before this work is due to happen), and will need a bit of care. It should just be a case of putting it into maintenance mode beforehand, but it holds 1/3 of the cluster's capacity.

cmooney claimed this task. Jun 5 2024, 7:16 PM

cmooney triaged this task as Medium priority.

cmooney updated the task description.

db1205 is the secondary media backups metadata db server, usually just a standby for db1204. Unless it is acting as the active server because the primary is unavailable, we only need to check that replication restarts correctly after maintenance.

ABran-WMF updated the task description. Tue, Jun 25, 9:23 AM

Marostegui closed subtask T368374: Move one host temporarily to m2 as Resolved. Wed, Jun 26, 4:21 AM

Marostegui updated the task description. Mon, Jul 1, 5:01 AM

Marostegui updated the task description. Fri, Jul 5, 5:04 AM

ayounsi moved this task from Backlog to This quarter on the netops board. Mon, Jul 8, 7:24 AM

cmooney updated the task description. Mon, Jul 8, 4:07 PM

Icinga downtime and Alertmanager silence (ID=6a298ae5-e736-4051-8220-9ec4f352950a) set by cmooney@cumin1002 for 0:40:00 on 1 host(s) and their services with reason: prep JunOS upgrade lsw1-e3-eqiad

lsw1-e3-eqiad.mgmt

Icinga downtime and Alertmanager silence (ID=39fcbcd0-8c16-4208-ac06-f4b442e55a54) set by cmooney@cumin1002 for 0:30:00 on 4 host(s) and their services with reason: JunOS upgrade lsw1-e3-eqiad

lsw1-e3-eqiad,lsw1-e3-eqiad IPv6,ssw1-e1-eqiad.mgmt,ssw1-f1-eqiad.mgmt

Icinga downtime and Alertmanager silence (ID=2a5cb43e-793c-4103-9499-369354315479) set by cmooney@cumin1002 for 0:40:00 on 27 host(s) and their services with reason: JunOS upgrade lsw1-e3-eqiad

an-presto1010.eqiad.wmnet,an-worker1154.eqiad.wmnet,backup1009.eqiad.wmnet,cephosd1003.eqiad.wmnet,db[1192,1198-1199,1204].eqiad.wmnet,druid1010.eqiad.wmnet,dse-k8s-worker1006.eqiad.wmnet,elastic[1093-1095].eqiad.wmnet,kafka-jumbo1012.eqiad.wmnet,kafka-stretch1001.eqiad.wmnet,kubernetes[1047-1051,1061].eqiad.wmnet,ml-serve1006.eqiad.wmnet,ms-be1074.eqiad.wmnet,mw[1491-1493].eqiad.wmnet,wdqs1015.eqiad.wmnet
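For anyone cross-checking the downtimed hosts against the rack inventory, the bracketed range notation in the list above can be expanded with a few lines of Python. This is only a rough sketch: `expand_hosts` is a hypothetical helper written for this note, not part of the cookbook that posted the downtime, and it assumes plain numeric ranges without zero padding. Expanding the list logged above this way gives 27 names, which agrees with the "27 host(s)" in the downtime message.

```python
import re


def expand_hosts(expr: str) -> list[str]:
    """Expand a comma-separated host expression with [a,b-c] numeric ranges.

    Hypothetical helper for illustration; assumes at most one bracket
    group per host and no zero-padded numbers.
    """
    hosts = []
    # Split on commas that are not inside square brackets.
    for item in re.split(r",(?![^\[]*\])", expr):
        match = re.fullmatch(r"(.*)\[([^\]]+)\](.*)", item)
        if match is None:
            # No bracket group: the item is already a single host name.
            hosts.append(item)
            continue
        prefix, body, suffix = match.groups()
        for part in body.split(","):
            # "1198-1199" is an inclusive range, "1192" a single number.
            start, _, end = part.partition("-")
            end = end or start
            for number in range(int(start), int(end) + 1):
                hosts.append(f"{prefix}{number}{suffix}")
    return hosts


print(expand_hosts("db[1192,1198-1199,1204].eqiad.wmnet"))
# ['db1192.eqiad.wmnet', 'db1198.eqiad.wmnet', 'db1199.eqiad.wmnet', 'db1204.eqiad.wmnet']
```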

Mentioned in SAL (#wikimedia-operations) [2024-07-09T15:04:20Z] <topranks> rebooting lsw1-e3-eqiad to install updated JunOS version T365998

Clement_Goubert subscribed. Thu, Jul 18, 2:24 PM

Mentioned in SAL (#wikimedia-operations) [2024-07-18T14:47:54Z] <arnaudb@cumin1002> dbctl commit (dc=all): 'T365998 - depooling db1195 - s1 db1202 - s7 db1203 - s8', diff saved to https://phabricator.wikimedia.org/P66816 and previous config saved to /var/cache/conftool/dbconfig/20240718-144754-arnaudb.json

data-persistence hosts handled, ready whenever you are @cmooney
