At SQLUndercover we decided it would be a good idea to not only upgrade our test lab but to treat it as if it were a live system carrying out an in place upgrade with minimal downtime, here’s how we did it:
Here is our very basic setup for this scenario, we have two availability groups ‘AG1’ and ‘AG2’, for the purposes of this blog post we will focus our attention on just AG1.
As you can see we have two SQL replicas ‘SQL01’ and ‘SQL02’, we will be starting this scenario with SQL01 acting as the primary replica, so let’s get started.
Now would be a good time to take any Full backups that may be missing or inaccessible
Before you start be sure to check you have sufficient log drive space for potential log file growth in case the SQL Server installations take longer than expected.
Don’t have an easy way to check for the last FULL Backups for each database? a quick query on Msdb should help, something like this should do the trick:
SELECT [Database_name], MAX(Backup_Start_Date) FROM Msdb..BackupSet WHERE [Type] = 'D' GROUP BY [Database_name]
Alternatively if you do not have many databases to check and you are able to pass the database names in as a comma delimited string you could use sp_RestoreScript from our UNDERCOVER TOOLBOX against its will (just don’t tell David) by using it to generate restore statements that will show you the latest FULL backup file taken for each database in the comma delimited list like this:
EXEC sp_RestoreScript @Databasename = 'SQLUndercover,SomeOtherDB', @RestoreOptions = 'ToFull'
1. Let’s get the secondary replica (SQL02) switched from Synchronous mode to Asynchronous mode, we need to ensure that this is set so that transactions on the primary can be committed without waiting for confirmation from the secondary replica, later the SQL service on the secondary will be restarted as part of the upgrade and the server itself will be rebooted.
2. Connect to the Secondary server using which ever method you use to access the machine as if it were a desktop machine , RDP/Radmin/VMWare etc , in our case we are connecting to SQL02 (you may have more than one so this will be the first secondary that you are planning to upgrade). Start the SQL Server 2017 Installer…
Under the menu on the left hand side click ‘Installation’ then choose ‘Upgrade from a previous version of SQL Server’ , click Next.
3. Select the Edition you wish to install and click next.
4. Run through the installation until you get to the ‘Select Instance’ step, Select the instance you want to upgrade and click next.
5. When you reach the ‘Instance Configuration’ step, confirm the details on-screen and click next.
6. After waiting for a short period of time you should reach the ‘Ready to Upgrade’ step, if by this point you have sweaty palms and want to back out you can but once you click ‘Upgrade’ there’s no turning back so be sure you are 100% happy with the choices you made in the previous steps.
7. This bit will take a little while… Feeling a bit parched? why not grab yourself a cuppa , you have a little time on your hands.
8. At this point in the installation the SQL Service will be restarted on the secondary server you are working on, this is where our Asynchronous switch comes into its own.
9. If you happen to be watching the Availability group Dashboard/s then you will notice that the Replica you are working on will be showing as Critical with Warnings – this is fine we can ignore these as they are planned and will not be affecting the Primary node from a service availability point of view, transactions will be happily committing between the Primary and any other Secondaries you may have, but for now the queue will be building for this server until it comes back online and back into sync.
10. Service has been restarted as part of the installation and now the installation has completed, if you are still watching the Availability group dashboard you will notice that things should have re-synced back up again with some lovely looking green ticks (these won’t stay for long unfortunately) , it’s now time to Restart that secondary – Lets bounce the box…
11. After quivering over the restart button from the start menu, double checking and triple checking I am on the correct server and wiping the sweat from my brow – I take the plunge!
12. When SQL02 comes back up you will notice that the dashboard will be showing those lovely green ticks again as things are back in Sync, a quick connection to SQL02 from management studio will reveal that the version is now 14.0.600.250.
13. Time to set the Synchronization mode back to Synchronous Commit on the Secondary in preparation for Failing the Availability group over to the Secondary Server so that we can upgrade the primary with the least impact on the front end users.
14. If you have more than one Secondary Replica then you should repeat all of the previous steps and only proceed past this step once everything is in sync and just the Primary upgrade remains.
15. Give the Availability group a quick once over to check that it’s completely ‘Synchronized’ and not showing as ‘not Synchronizing’ .
16. You should now consider taking backups of ALL databases including system databases, the method you use is up to you just be sure that if you are taking Differential or Log backups that the Full is available and that the log chain is in tact.
Don’t forget to backup, this is your only lifeline if you need to go back to 2016, once the database/s are upgraded to 2017 all subsequent backups can only be restored using SQL Server 2017.
17. once you are happy that you have contingency in place in the form of backups then you should feel confident in proceeding with the failover.
Be sure to check you Log drive space as there is a chance Logs could grow if the installation hangs or takes a long time if Logs are reaching their space allocation on busy databases.
You are about to witness Database states that you may have never seen before 🙂
18. A quick sanity check here to be absolutely certain that there will be no data loss…
19. There’s a new primary in town! SQL02…
20. For those of you who are a little squeamish look away now!
What!? In Recovery!?
Don’t panic, this is still going as planned as this is totally expected and this is the reason why:
We are now in a situation where we have the Primary server running 2017 but one (for us) or possibly more than one for you running on 2016 , its not possible for the 2016 server to synchronize as its databases have not been upgraded yet, they will therefore be stuck in recovery but we are about to fix that very soon.
21. As before connect to the SQL Server and install SQL Server 2017 using the Upgrade option, below is an in progress shot of the Availability group dashboard (top center) , SQL01 with installation in progress at the point where the service has been stopped (Bottom Left) and SQL02 (Bottom Right)
If this was actually a live system and not just our test lab I would run sp_FailedLogins and Sp_FailedJobs periodically on our new primary (SQL02) whilst the upgrade of SQL01 is in progress and also afterwards too just to be sure that we are on top of our game.
22. Once the upgrade is complete on SQL01 and the box restarted you should see the dashboard showing a warning triangle rather than the Red X, but still not in Sync yet…. Don’t panic there is a perfectly good explanation for this…
Versions are looking good!
23. If we expand the Availability Databases tree we have the answer – our Data movement was Paused so until we resume the data movement the Synchronization state will remain as ‘Not Synchronizing’, Lets get this moving…
24. A Quick check of the SQL Server Error log on the Old Primary – now secondary (SQL01) shows that the Databases has been successfully upgraded , Nice .. now we are back in business!
25. Wipe the excess sweat from you palms, breathe a sigh of relief and go make another cuppa, you deserve it but before you do just open the Availability group dashboard so that you can witness those wonderful green ticks….
26. Lets puts things back to the way they were – we set the synchronization mode back to Synchronous Commit on SQL01.
And Voila , we are fully upgraded with minimal impact to the end users – just a Failover to this point, of course if we really wanted things to be back to the way they were then we would need to fail things back over again to SQL01. Well just to be sure that everything was completely successful we did this too just to ensure that there were no doubts and this was successful however in a real life situation you may choose to leave this second failover for a maintenance window, of course this sort of work would always have a preference to be carried out during a maintenance window anyway but it’s always nice to see what can be achieved without such luxuries.
Thanks for reading 🙂