Did you know that many network faults are preventable? Many others can be fixed quickly with a few basic network “hygiene” steps. The most serious issues are usually the result of several smaller issues or faults. If you fix the smaller issues, you won’t see as many of the bigger problems.
Keep in mind that network management is not glamorous. You won’t be seen as a hero for heading off production problems, either. Even so, you will have peace of mind because your network is running smoothly. Some tips to help you maintain good network hygiene, including preventing de-duplication, can be found in this article.
Sort Out Your Naming Standards
It doesn’t matter which standard you choose, as long as each name provides useful information: for example, what type of device it is, where it is physically located, or where it sits in the network.
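As a rough illustration, a naming standard can be checked automatically. The sketch below assumes a made-up convention of `<site>-<role>-<number>` (for example `nyc-sw-01`); the role codes and the function name `parse_hostname` are hypothetical, so adapt them to whatever standard you settle on.

```python
import re

# Hypothetical convention: <site>-<role>-<number>, e.g. "nyc-sw-01".
# The role codes below are illustrative assumptions, not a standard.
ROLES = {"sw": "switch", "rtr": "router", "fw": "firewall", "ap": "access point"}
HOSTNAME_RE = re.compile(r"(?P<site>[a-z]{3})-(?P<role>[a-z]+)-(?P<index>\d{2})")


def parse_hostname(name):
    """Return the fields encoded in a hostname, or None if it breaks the convention."""
    match = HOSTNAME_RE.fullmatch(name.lower())
    if match is None or match.group("role") not in ROLES:
        return None
    return {
        "site": match.group("site"),
        "device_type": ROLES[match.group("role")],
        "index": int(match.group("index")),
    }
```

Running every inventory name through a check like this makes it easy to spot devices that were named outside the standard.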
Implement Basic Monitoring and Alerting
Make sure you have implemented basic availability monitoring for all the devices in use. This is considered a minimum requirement. You should also add memory and CPU monitoring, plus throughput monitoring for any key interfaces. These will typically include all switch-server, switch-switch, and WAN links.
While it would be nice to run a simple software application and have detailed reports generated for each interface, this is not practical for most. The manual effort and cost involved are usually too high. The key is to do what you can. Usually, you won’t have to implement advanced correlation capabilities. Use your common sense and get a basic understanding of your network. If you collect historical data, it will help quite a bit when you are trying to troubleshoot a slow network.
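Throughput monitoring usually comes down to sampling an interface octet counter twice and converting the difference into a utilization figure. The following is a minimal sketch of that arithmetic only; it assumes you already collect counter values (for example the IF-MIB `ifHCInOctets` object) by some other means, and the function name and parameters are illustrative.

```python
def utilization_percent(prev_octets, curr_octets, interval_s, speed_bps, counter_bits=64):
    """Estimate link utilization (%) from two octet-counter samples.

    prev_octets / curr_octets: counter readings taken interval_s seconds apart.
    speed_bps: interface speed in bits per second.
    counter_bits: counter width, used to correct a single wrap-around.
    """
    if interval_s <= 0 or speed_bps <= 0:
        raise ValueError("interval_s and speed_bps must be positive")
    delta = curr_octets - prev_octets
    if delta < 0:
        # The counter wrapped past its maximum between the two samples.
        delta += 2 ** counter_bits
    # Octets -> bits, then divide by the bits the link could have carried.
    return delta * 8 / (interval_s * speed_bps) * 100
```

For example, 3,750,000,000 octets in 60 seconds on a 1 Gbps link works out to 50 percent utilization. Storing these values over time gives you a baseline to compare against when something looks wrong.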
Automate All Backups
You can get a tool to handle this for you. It does not have to be anything fancy or expensive, but you do need regular backups of everything. Keep in mind, this is not just for recovering from a device failure (although it helps with that, too). It will also let you see what is changing within your network.
If you don’t have the funds for this, there are some free options. If you do have a budget, there are even better tools. Research the options to find out which automation tool will be best for you. Once you have collected the configurations, you can begin to analyze them. For example, check that the SNMP settings are correct.
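To show how backups reveal change, here is a small sketch that compares two saved copies of a configuration using Python's standard `difflib` module. It assumes the backups are available as plain text; the function name `config_changes` is hypothetical, and a dedicated backup tool will typically do this for you.

```python
import difflib


def config_changes(old_text, new_text):
    """Return only the lines removed ("- ") or added ("+ ") between two backups."""
    diff = difflib.ndiff(old_text.splitlines(), new_text.splitlines())
    # ndiff also emits unchanged lines ("  ") and hint lines ("? "); drop those.
    return [line for line in diff if line.startswith(("+ ", "- "))]
```

An empty result means nothing changed between the two backups; anything else is worth matching against a known change request.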
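A check of SNMP settings across backed-up configurations might look like the sketch below. It assumes Cisco IOS-style `snmp-server` lines; the list of required lines is a hypothetical policy you would define yourself, and other platforms will need different patterns.

```python
import re

# Assumption: IOS-style syntax, "snmp-server community <string> RO|RW".
COMMUNITY_RE = re.compile(r"^[ \t]*snmp-server community (\S+)", re.MULTILINE)
DEFAULT_COMMUNITIES = {"public", "private"}


def audit_snmp(config_text, required_lines):
    """List SNMP problems: required lines that are missing, default communities in use."""
    present = {line.strip() for line in config_text.splitlines()}
    findings = [f"missing: {line}" for line in required_lines if line not in present]
    for community in COMMUNITY_RE.findall(config_text):
        if community.lower() in DEFAULT_COMMUNITIES:
            findings.append(f"default community in use: {community}")
    return findings
```

Run against every stored configuration, a check like this quickly shows which devices have drifted from the settings you expect.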
Network Time Protocol (NTP)
NTP is available at no cost and gives you a lot in return: consistent, accurate timestamps on every device. When you enable it, be sure to set a consistent time zone across all your devices. Some businesses want to keep everything in UTC, but this is up to you. You may find it more convenient to use your local time zone. It doesn’t matter which you choose, but you do need to stay consistent. When you are trying to work out the order in which events occurred across several devices, consistent timestamps make things much easier.
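To see why consistency matters, consider what happens when devices log in different time zones: the raw timestamps cannot be compared until they are normalized. The sketch below assumes ISO 8601 timestamps that carry a UTC offset; the event tuples and function names are made up for illustration.

```python
from datetime import datetime, timezone


def to_utc(timestamp):
    """Convert an ISO 8601 timestamp with a UTC offset to an aware UTC datetime."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Without an offset there is no way to know which zone the device used.
        raise ValueError(f"timestamp has no UTC offset: {timestamp}")
    return parsed.astimezone(timezone.utc)


def order_events(events):
    """Sort (timestamp, device, message) tuples by their true UTC time."""
    return sorted(events, key=lambda event: to_utc(event[0]))
```

An event stamped 09:15 at UTC-5 actually happened after one stamped 14:14:30 UTC, which is easy to miss when reading the logs side by side. With every device on the same zone, this conversion step disappears.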
Traps and Logs
Make sure that all the devices connected to the network are sending traps and syslogs to a central location. Once they have been received, be sure to actually look at them. This information can be quite revealing.
For example, think about what it means if you receive an STP Topology Change trap. It means there is either an inter-switch link that is flapping or a problem with port configuration. In either case, you have a faulty link or users who are frustrated with things that are not working. With this information, you can apply the proper fix for the problem rather than trying to fix things blindly.
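One way to act on such traps is to flag any device that reports several topology changes in a short period. The following sketch assumes the traps have already been received and reduced to `(timestamp_seconds, device)` pairs; the window, the threshold, and the function name are arbitrary illustrative choices.

```python
from collections import defaultdict


def frequent_topology_changes(events, window_s=300, threshold=3):
    """Return devices that sent at least `threshold` traps within `window_s` seconds."""
    by_device = defaultdict(list)
    for timestamp, device in events:
        by_device[device].append(timestamp)

    flagged = set()
    for device, stamps in by_device.items():
        stamps.sort()
        start = 0
        for end, timestamp in enumerate(stamps):
            # Slide the window start forward until it spans at most window_s.
            while timestamp - stamps[start] > window_s:
                start += 1
            if end - start + 1 >= threshold:
                flagged.add(device)
                break
    return flagged
```

A device that shows up here repeatedly is a good candidate for checking inter-switch links and port configuration first.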
There are a few strategies you can use for the log review process. These include:
- Filtering out anything that is known or valid
- Conducting this filtering process several times to eliminate the “noise”
- Generating a list of “unknown” log entries
These “unknowns” are the ones you need to investigate further. Also, look for common entry patterns and entry sources. These indicate either that something is abnormal or that something can be filtered at the source.
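The review strategy above can be sketched in a few lines. This example assumes log entries have already been collected as `(source, message)` pairs and that you maintain your own list of regular expressions for known or valid messages; both function names are hypothetical.

```python
from collections import Counter


def unknown_entries(entries, known_patterns):
    """Keep only entries whose message matches none of the known/valid patterns."""
    return [
        (source, message)
        for source, message in entries
        if not any(pattern.search(message) for pattern in known_patterns)
    ]


def top_sources(entries, n=3):
    """Count which sources produce the most entries, most frequent first."""
    return Counter(source for source, _ in entries).most_common(n)
```

Each review pass typically adds a few more patterns to the known list, so the “unknown” output shrinks until only the entries worth investigating remain.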
Network Management is an Art and a Science
Network management can be complicated, but it is necessary for seamless operation and a superior level of uptime. Keep the information here in mind when you implement network management processes. If necessary, reach out to a third-party service provider for assistance. They will be able to help you with network management and ensure you get the solutions you need.
About the Author: Rick Delgado
Rick Delgado is a business technology consultant for several Fortune 500 companies. He is also a frequent contributor to news outlets such as Wired, Tech Page One, and Cloud Tweaks. Rick enjoys writing about the intersection of business and new innovative technologies.