Cisco CCNP TSHOOT Planning Processes for Troubleshooting Complex Enterprise Networks

ccnp-tshoot-complex-01

Cisco CCNP TSHOOT Troubleshooting – What is it?

ccnp-tshoot-complex-02

There are three main steps in troubleshooting:

Identify the problem – The first part of troubleshooting is identifying what problem you are trying to solve.  Is it a connectivity problem, a network slowdown, route flapping, or something else?  Having a baseline for expected response times, route paths, etc. is key to knowing whether the problem you are trying to identify is a real problem or only one perceived by users.
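
The baseline idea can be sketched in a few lines of Python. This is only an illustration: the device names, the baseline values and the 1.5x tolerance are assumptions made for the example, not figures from the course material.

```python
# Hypothetical baseline: expected round-trip times in milliseconds.
BASELINE_RTT_MS = {"branch-router": 20.0, "file-server": 5.0}


def is_real_problem(target, measured_ms, tolerance=1.5):
    """Return True when the measured RTT exceeds baseline * tolerance."""
    expected = BASELINE_RTT_MS[target]
    return measured_ms > expected * tolerance


# A reading close to the baseline is likely a perceived problem only.
print(is_real_problem("file-server", 6.0))
# A reading far above the baseline deserves a closer look.
print(is_real_problem("file-server", 40.0))
```

Without a recorded baseline there is nothing to compare the measurement against, which is why the baseline has to exist before the problem is reported.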

Diagnose the problem – Once the problem is identified, it is time to gather information, analyze the information gathered and come up with a proposed solution.  

Solve the problem – Finally, implement the proposed solution and test whether it worked.  If it did, the problem is solved.  If not, go back to step 2, gather more information, then analyze it and propose a different solution.


Cisco CCNP TSHOOT Diagnosis Fundamentals

ccnp-tshoot-complex-03

Diagnosing a problem has many key steps.  It is the most time-consuming part of the troubleshooting process.


Cisco CCNP TSHOOT Troubleshooting Methodology

ccnp-tshoot-complex-04

Having a structured approach to troubleshooting increases your chances of resolving problems in a timely manner.  Based on the figure on the slide, you first define the problem, then start gathering information.  You then analyze the data gathered and eliminate variables.  From there you propose a hypothesis and test it.  This will more than likely be an iterative process until you ultimately determine a solution.  Documenting findings along the way is key to a successful resolution.
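
The iterative propose-and-test cycle can be sketched as a small loop. This is a minimal sketch, not a prescribed procedure: the function name, the candidate causes and the stand-in test are all hypothetical.

```python
def run_troubleshooting(hypotheses, test_hypothesis):
    """Test hypotheses in order, recording each result, until one is confirmed."""
    findings = []  # documentation kept along the way
    for hypothesis in hypotheses:
        confirmed = test_hypothesis(hypothesis)
        findings.append((hypothesis, confirmed))
        if confirmed:
            return hypothesis, findings
    # Nothing confirmed: time to gather more information.
    return None, findings


# Hypothetical candidate causes and a stand-in test that confirms one of them.
candidates = ["bad cable", "missing default route", "ACL blocking traffic"]
cause, log = run_troubleshooting(candidates, lambda h: h == "missing default route")
print(cause)
print(log)
```

The `findings` list stands in for the documentation step: every tested hypothesis is recorded, including the ones that were ruled out.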


Cisco CCNP TSHOOT Structured versus Unstructured Troubleshooting

ccnp-tshoot-complex-05

In general, an unstructured approach to troubleshooting decreases your chances of resolving problems in a timely manner.  It is basically a “shoot from the hip” approach where you try things without gathering all the data necessary to really determine the problem.  You might get lucky, but in general a structured approach to troubleshooting is preferred over an unstructured one.


Cisco CCNP TSHOOT Bottom Up Troubleshooting

ccnp-tshoot-complex-06

The bottom-up troubleshooting method starts at Layer 1 (Physical) of the OSI Reference Model and works its way up to Layer 7 (Application).  Working in this direction, you can eliminate potential problem causes and narrow the scope of the potential problems.

The main benefit of this method is that the initial troubleshooting takes place in areas you control.  Access to application software is not required until the late stages of troubleshooting and possibly not at all.

The main disadvantage of this method is that in large networks gathering and analyzing data is very time-consuming.

Hence, when using this troubleshooting method, it is best to first reduce the problem scope using a different strategy then utilize a bottom-up approach when the scope has been reduced.


Cisco CCNP TSHOOT Top Down Troubleshooting

ccnp-tshoot-complex-07

The top-down troubleshooting method starts at Layer 7 (Application) of the OSI Reference Model and works its way down to Layer 1 (Physical).  Since each layer of the OSI Reference Model is dependent on the layers below it, you can safely assume that if the layer you are troubleshooting works, then the underlying layers are working as well.

Therefore, the goal of this troubleshooting method is to find the highest OSI layer that is functioning properly.  Once found, all processes at that layer and below can be eliminated as potential problems.

The Top Down Troubleshooting method is one of the most straightforward troubleshooting methods.  One drawback to this method is that you must have access to application layer software to initiate the troubleshooting process.  This may or may not be the case.


Cisco CCNP TSHOOT Happy Medium – Divide and Conquer

ccnp-tshoot-complex-08

Divide-and-conquer is a happy medium between the top-down and bottom-up troubleshooting methods.  It starts in the middle of the OSI model, typically at the network layer (e.g. with a ping test).  If the test succeeds, you can assume, just as with the top-down method, that the tested layer and all lower layers are good, and you continue bottom-up from that point.  If the test fails, you continue top-down from that point.

This method typically results in faster elimination of potential problems.
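
As a rough sketch, the divide-and-conquer decision can be written as a function that returns the layers still under suspicion after a network-layer test. The function name is hypothetical, and the layer list is simply the standard OSI ordering.

```python
OSI_LAYERS = [
    "physical", "data link", "network", "transport",
    "session", "presentation", "application",
]


def layers_to_check(network_test_passed):
    """Return the layers still suspect after a network-layer test such as ping."""
    idx = OSI_LAYERS.index("network")
    if network_test_passed:
        # Layers 1-3 are assumed good; work upward from layer 4.
        return OSI_LAYERS[idx + 1:]
    # The fault is at layer 3 or below; work downward from layer 3.
    return list(reversed(OSI_LAYERS[:idx + 1]))


print(layers_to_check(True))
print(layers_to_check(False))
```

Either outcome of the single test removes roughly half of the layers from consideration, which is where the speed-up comes from.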


Cisco CCNP TSHOOT Tracing Network Path

ccnp-tshoot-complex-09


Cisco CCNP TSHOOT Comparing Like Devices

ccnp-tshoot-complex-10

Note: The output of the show ip route command on Router2 does not have a default route.  That might be the problem, since Router1 and Router2 are branch routers that are supposed to be similarly configured but with different IP addresses.
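
The comparison can also be automated once the routing tables have been collected. The sketch below assumes the prefixes were already extracted from each router's show ip route output into plain lists; the prefixes themselves are invented for the example and are not taken from the slide.

```python
DEFAULT_ROUTE = "0.0.0.0/0"


def routers_missing_default(route_tables):
    """Return the names of routers whose table lacks a default route."""
    return sorted(
        name for name, prefixes in route_tables.items()
        if DEFAULT_ROUTE not in prefixes
    )


# Hypothetical prefixes; the local subnets differ, as expected for branch sites.
route_tables = {
    "Router1": ["0.0.0.0/0", "10.1.1.0/24"],
    "Router2": ["10.1.2.0/24"],
}
print(routers_missing_default(route_tables))
```

Because the two routers use different addresses, a line-by-line diff of their tables would be noisy. Checking only for entries that should be common to both, such as the default route, keeps the comparison meaningful.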


Cisco CCNP TSHOOT Test the Problem from a Different Device

ccnp-tshoot-complex-11

Swapping connections is a common troubleshooting technique.  It can eliminate hardware (e.g. cables, routers/switches and hosts) as being the problem or in cases where hardware is the problem, it can point to the specific device as the problem.


Cisco CCNP TSHOOT The Troubleshooting Process

ccnp-tshoot-complex-12

Troubleshooting is an art.  Each troubleshooter is an artist with a blank canvas.  How you paint the canvas determines how good a troubleshooter you will become.  As you gain troubleshooting experience, you will get better and add to the tools at your disposal (e.g. Wireshark, configuration management information, etc.).


Cisco CCNP TSHOOT Problem Definition

ccnp-tshoot-complex-13

Problems can be reported in numerous ways.  They can be reported via a formal ticketing process, described in an email or told in person, to name just a few.  It is important to define the problem clearly so as not to waste time on false assumptions.  A good problem description consists of an accurate description of the symptoms, not interpretations or conclusions.  The more specific a problem description is, the better.  An accurate description of the problem allows the troubleshooter to more readily start gathering information for analysis.


Cisco CCNP TSHOOT Information Gathering

ccnp-tshoot-complex-14

After the problem is clearly defined, it is time to start gathering data.  Here you need to identify what the targets are for the information-gathering process.  For example, from which network devices, clients, or servers do you want to gather information? Which tools do you plan on using to gather that information?  Once enough information is gathered, it is time to start analyzing the data.


Cisco CCNP TSHOOT Analyzing the Facts

ccnp-tshoot-complex-15

Analyzing (or interpreting) the data gathered is the next step in the troubleshooting process.  It is similar to detective work.  You need to analyze the facts to come up with a hypothesis of what is going on.  As part of the process you will eliminate variables.  Note: This is typically an iterative process.


Cisco CCNP TSHOOT Elimination of Possible Causes

ccnp-tshoot-complex-16

Once the information is gathered and analyzed, you can start drawing conclusions, eliminate possible causes, and propose a hypothesis so that you can start testing potential solutions.


Cisco CCNP TSHOOT Example of Eliminating a Possible Cause

ccnp-tshoot-complex-17

Start at the source of the problem and try eliminating possible causes while slowly working your way towards the destination.
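
Working from the source toward the destination amounts to finding the first hop that does not respond. The sketch below is illustrative only: the hop names are hypothetical, and the reachability check is a stand-in for a real test such as a ping to each hop.

```python
def first_failing_hop(path, is_reachable):
    """Walk the path from source to destination; return the first unreachable hop."""
    for hop in path:
        if not is_reachable(hop):
            return hop
    return None  # every hop responded


# Hypothetical hop names and reachability results.
path = ["host-gateway", "access-switch", "branch-router", "core-router", "server"]
reachable = {"host-gateway", "access-switch"}
print(first_failing_hop(path, lambda hop: hop in reachable))
```

Every hop before the returned one has been eliminated as a possible cause, so attention can move to the failing hop and the link leading to it.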


Cisco CCNP TSHOOT Utilizing a Hypothesis to Aid in Troubleshooting

ccnp-tshoot-complex-18

After you have proposed and eliminated potential causes, one or more causes (preferably just one) should remain that have not been eliminated.  If more than one remains, choose the one you feel is most likely and use it as your problem hypothesis.  From there, determine whether the problem is your responsibility.  If it is, attempt to solve it; otherwise, escalate the problem to the responsible party.


Cisco CCNP TSHOOT Test Hypothesis

ccnp-tshoot-complex-19

Once a hypothesis is formulated as to the cause of the problem, the next step is to come up with a potential solution and test that potential solution.

Prior to testing, you need to determine the impact of the change on the network and weigh that against the urgency of the problem.  You might decide that the problem is not that critical, but that the change required to test a potential solution would make the network temporarily unavailable.  In that case you might want to wait to test during off hours or coordinate with the users on the best time to test.
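
The trade-off between impact and urgency can be expressed as a very small decision rule. This is a simplification made up for illustration: the two yes/no inputs and the returned labels are assumptions, and a real decision would also involve your change-management policy.

```python
def when_to_test(problem_is_critical, change_causes_outage):
    """Suggest when to test a potential fix, weighing impact against urgency."""
    if change_causes_outage and not problem_is_critical:
        # A disruptive change for a non-critical problem can wait.
        return "off-hours"
    return "now"


print(when_to_test(problem_is_critical=False, change_causes_outage=True))
print(when_to_test(problem_is_critical=True, change_causes_outage=True))
```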


Cisco CCNP TSHOOT Problem Resolution

ccnp-tshoot-complex-20

The problem is solved only after you have confirmed your hypothesis by testing your potential solution and the symptoms of the problem have disappeared.  Remember to back up your configurations once everything is working.  This is critical to your configuration management (CM) process.  Also, ensure that any trouble or incident ticket that was written is fully documented and that the users are told the problem has been resolved.


Cisco CCNP TSHOOT Chapter 3 Summary

ccnp-tshoot-complex-21