Servers are among the most important parts of an organization’s infrastructure, and maintaining them regularly and properly is a core responsibility of system administrators. This Ultimate Server Maintenance Checklist covers the checks and measures an administrator should perform regularly so that the servers stay up and running without any hiccups.
Let’s get started!
1. Backup Your Data
If your production data gets corrupted, having a backup of it is critical. Losing production data with no backup can lead to serious business loss. Depending on the amount of data your organization generates, schedule backups on a daily or weekly basis. Once backups are scheduled, check regularly that they are actually being written to the location you configured. You should also run test restores regularly to verify that the backups are usable.
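One simple way to check that a backup copy matches the original is to compare checksums. Here is a minimal Python sketch, assuming the backup is a plain file copy of the source; the function names `sha256_of` and `backup_matches` are made up for illustration and do not belong to any backup tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_matches(source: Path, backup: Path) -> bool:
    """True if the backup file exists and has the same content as the source."""
    if not backup.is_file():
        return False
    return sha256_of(source) == sha256_of(backup)
```

A matching checksum only shows that the copy is intact. A periodic test restore is still needed to confirm that the backup can actually be used.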
2. Check and Update Software Applications
Most software updates include security patches for previously discovered issues. Web applications are comparatively more prone to cyber-attacks, and a vulnerable application can lead to a security breach on the server, so you should always keep these applications up to date. Most operating systems have package managers that can update the software installed on the system automatically. You must configure and schedule these updates regularly.
3. Check OS Updates
An update to the server operating system is a major change. Attackers often look for opportunities in new releases, while the issues introduced by an update have not yet been found and fixed. You should always test your system after an OS update and check that all the applications are working correctly. Professionals commonly recommend scheduling server OS updates weekly.
4. Check Server Resource Usage
Processor and memory usage show how heavily the server is being utilized. If the business grows suddenly, your server resources should be able to scale with it. If your server is consistently using 80% of its resources, you should plan to scale it up in advance. That is why you should check the resource usage of the server regularly. Both Windows and Linux servers have built-in tools for tracking resource utilization.
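The 80% rule can be turned into a small automated check. The sketch below is one possible approach in Python, assuming a Unix-like server: it uses the load average divided by the number of cores as a rough stand-in for CPU utilization, which is not the same thing as a true CPU percentage. The names `load_ratio` and `should_plan_scaling` are illustrative.

```python
import os

SCALE_THRESHOLD = 0.80  # the 80% figure from the checklist


def load_ratio(load_average: float, cpu_count: int) -> float:
    """Load average divided by core count; about 1.0 means all cores are busy."""
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_average / cpu_count


def should_plan_scaling(ratio: float, threshold: float = SCALE_THRESHOLD) -> bool:
    """True once usage reaches the threshold."""
    return ratio >= threshold


if __name__ == "__main__":
    if hasattr(os, "getloadavg"):  # os.getloadavg is not available on Windows
        _, _, fifteen_min = os.getloadavg()
        ratio = load_ratio(fifteen_min, os.cpu_count() or 1)
        print(f"15-minute load ratio: {ratio:.2f}, plan scaling: {should_plan_scaling(ratio)}")
```

The 15-minute average is used here so that a short spike does not trigger a scaling decision; memory usage would need a separate check with your platform's tools.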
5. Check Network Usage
Just like resource (CPU and memory) capacity, servers also have network capacity. Check how much of your network capacity is being used; if it is close to fully utilized, you will have to upgrade it with additional capacity. Monitoring network usage with suitable tools will also help you solve many other problems with your server’s network.
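If your monitoring tool or operating system exposes interface byte counters, link utilization can be worked out from two samples. The following Python sketch shows the arithmetic only; it assumes you already have the two counter readings and know the link speed, and the function name `link_utilization` is hypothetical.

```python
def link_utilization(bytes_start: int, bytes_end: int,
                     interval_seconds: float, link_bits_per_second: float) -> float:
    """Fraction of link capacity used between two byte-counter samples."""
    if interval_seconds <= 0 or link_bits_per_second <= 0:
        raise ValueError("interval and link speed must be positive")
    bits_transferred = (bytes_end - bytes_start) * 8
    return bits_transferred / (interval_seconds * link_bits_per_second)
```

For example, 62,500,000 bytes transferred in 10 seconds on a 100 Mbit/s link works out to a utilization of 0.5. This sketch does not handle counters that wrap around or reset between samples.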
6. Check Storage Usage
You need to check the storage capacity on the servers regularly. If usage reaches 90%, you either need to free up space by removing old logs and outdated or unused software, or add more storage space. It is recommended to keep 20-30% of disk storage free at all times.
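These thresholds are easy to check with a script. Here is a minimal Python sketch using the standard library's `shutil.disk_usage`; the function names and the exact warning cut-off (less than 20% free) are assumptions chosen to match the numbers above.

```python
import shutil


def used_fraction(path: str = "/") -> float:
    """Fraction of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def storage_status(fraction_used: float) -> str:
    """Classify usage: act at 90% used, warn once less than 20% is free."""
    if fraction_used >= 0.90:
        return "critical"
    if fraction_used > 0.80:
        return "warning"
    return "ok"
```

Calling `storage_status(used_fraction("/"))` gives a quick status for the root filesystem; each mounted volume would need its own call.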
7. Check Remote Management Tools
If you have a cloud-based server environment or you manage your servers remotely, you will be using remote management tools like remote reboot, rescue mode and remote console. Check regularly that these tools are working as they are supposed to, because all remote server management tasks depend on them.
8. Check RAID Alarm
Most production servers use RAID because of its reliability. RAID failures are rare, but they do happen. RAID controllers usually come with software for checking the array status and setting up alarms in case any disk stops working. The server administrator must keep a close eye on such alerts daily.
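Hardware RAID controllers ship with their own vendor tools, so there is no single way to check them. As one example only, if a server uses Linux software RAID (md), the array status can be read from `/proc/mdstat`, where a field such as `[UU]` means all members are up and `[U_]` means one is missing. The Python sketch below assumes that format; the function name `degraded_arrays` is illustrative.

```python
import re

ARRAY_LINE = re.compile(r"^(md\d+)\s*:")    # e.g. "md0 : active raid1 ..."
STATUS_FIELD = re.compile(r"\[([U_]+)\]")   # e.g. "[UU]" or "[U_]"


def degraded_arrays(mdstat_text: str) -> list[str]:
    """Return names of md arrays whose status field shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY_LINE.match(line)
        if header:
            current = header.group(1)
            continue
        status = STATUS_FIELD.search(line)
        if current and status and "_" in status.group(1):
            degraded.append(current)
            current = None
    return degraded
```

On such a server you could pass in the contents of `/proc/mdstat` and raise an alert whenever the returned list is not empty.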
9. Update the Control Panel
Always keep the control panel you use for managing your servers up to date. cPanel is one of the most popular control panels used in the industry. Updating the control panel alone is not enough; you also need to manually update the software managed through it, such as Apache and PHP.
10. Perform Server Malware Scan
You should regularly perform a detailed malware scan on your servers to find any viruses, malware or infected files. A detailed scan puts a significant load on the system, so it is best to schedule it for off-peak hours, for example around midnight.
11. Evaluate User Accounts
There can be a huge security and legal risk if former employees or former clients still have access to your servers. People with bad intentions can do serious damage to the organization’s reputation and cause financial loss. That’s why you should regularly review the accounts that have access to the servers.
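On Linux servers, one way to review accounts is to compare the users who have a login shell against a list of approved users. The Python sketch below works on text in the `/etc/passwd` format; the approved list, the set of non-login shells, and the function names are assumptions for illustration, and accounts managed elsewhere (SSH keys, directory services, control panels) need their own review.

```python
NON_LOGIN_SHELLS = {"/usr/sbin/nologin", "/sbin/nologin", "/bin/false"}


def login_accounts(passwd_text: str) -> set[str]:
    """Return usernames from passwd-format text that have a login shell."""
    accounts = set()
    for line in passwd_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) != 7:
            continue  # skip malformed entries
        name, shell = fields[0], fields[6]
        if shell not in NON_LOGIN_SHELLS:
            accounts.add(name)
    return accounts


def unexpected_accounts(passwd_text: str, approved: set[str]) -> set[str]:
    """Accounts that can log in but are not on the approved list."""
    return login_accounts(passwd_text) - approved
```

Anything returned by `unexpected_accounts` is a candidate for a closer look, not proof of a problem.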
12. Change Passwords
It is recommended to change the server passwords at least every 6 months; you can also do this monthly. You will often share a password with a colleague on your team for maintenance work, and this is a security risk. Also, rather than making up passwords manually, you should use a good password generator to set strong passwords for the servers.
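If you do not have a password generator at hand, Python's standard `secrets` module can serve as one. This is a minimal sketch; the default length of 20, the minimum of 12, and the character set are assumptions, so adjust them to your own password policy.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length: int = 20) -> str:
    """Return a random password with at least one lowercase letter, uppercase letter and digit."""
    if length < 12:
        raise ValueError("use at least 12 characters")
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if (any(c.islower() for c in candidate)
                and any(c.isupper() for c in candidate)
                and any(c.isdigit() for c in candidate)):
            return candidate
```

The `secrets` module is used instead of `random` because it is designed for security-sensitive values.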
13. Check Server Logs Regularly
Every server generates and maintains logs, which record everything happening on it. Checking these server logs will help you identify frequently recurring errors so that you can fix them permanently. You will also be able to find details of unauthorized access attempts in these logs.
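As an example of spotting unauthorized access attempts, the Python sketch below counts failed SSH password logins per source address. It assumes OpenSSH-style "Failed password" entries, as found in the authentication log on many Linux systems; the function name `failed_logins_by_ip` is illustrative, and other services log in different formats.

```python
import re
from collections import Counter

FAILED_LOGIN = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+)")


def failed_logins_by_ip(lines) -> Counter:
    """Count failed SSH password attempts per source address."""
    counts = Counter()
    for line in lines:
        match = FAILED_LOGIN.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts
```

Passing an open log file to `failed_logins_by_ip` and calling `most_common()` on the result shows which addresses are responsible for the most failed attempts.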
14. Clean the Hardware Physically
Not every server issue is caused by a software bug; dust and dirt can cause problems too. If the hardware is not cleaned at regular intervals, this can lead to hardware failure. You should physically inspect the place where the servers are kept and clean all the equipment. This keeps the hardware dust free and reduces the chance of a physical failure.
15. Check for Hardware Errors
Always check the logs that record hardware details for error events. These logs contain details of disk failures, network cable issues, overheating problems and so on. If you are using high-quality hardware, hardware errors are less likely.
That was all about the server maintenance checklist; I hope it was useful. If you are new to this and just getting started, start using this checklist from the beginning. If you are already maintaining servers, run them through the checklist above; this will help you identify a lot of issues and fix them quickly.