Troubleshooting is one of the most important skills a person can develop in their career. The goal is to determine why something does not work as expected and to find a solution to the problem at hand. It is often overlooked or misunderstood, and when done poorly it can lead to hours of head-scratching frustration.
Here's a step-by-step process to follow when troubleshooting that will help ease the pain of finding a solution to any problem.
Step #1 - Gather information
When did the problem start?
What are the symptoms?
Is it reproducible?
A symptom can have more than one cause, so you need as much information as possible up front. You might be able to fix the problem just by knowing what happened before the symptoms started.
Real life example
Customer states
We’re having an issue with a D-2060 Mod 2.5. The DC is working fine now, but when I set the machine up as a MOD 2.5 and go to the calibration screen, it doesn’t display anything on either ammeter. It doesn’t sound like it’s firing. We can turn the Span and Bias all the way to max and nothing.
Gather information
Electrical Schematics
New machine that is being checked for quality control
No mag shot when requested on AC (MOD 2.5)
Part of the system is working (DC)
Step #2 - Note probable causes
When troubleshooting equipment, there is always a starting point. Figure out what negative events are occurring, then look at the systems around those problems and identify key points of failure. If there are multiple possibilities, rank them from most likely to least likely and test each probable cause in a systematic order. This will bring you closer to the root cause of the problem.
"Tackle the root cause not the effect." - Haresh Sippy
An action in one area triggers an action in another, and another, and so on. By tracing back these actions, you can discover where the problem started and how it grew into the symptom you're now facing.
Real life example
Note probable causes
Possible failure | Notes |
---|---|
Mag shot button failure | DC mag shot works, not the issue because they use the same button |
PLC input failure | DC mag shot works, not the issue because they use the same PLC input |
PLC output failure | |
Control Relay failure | |
Firing board - Command signal | |
Firing board - Enable signal | |
Firing board - Power sync | |
Firing board - Circuitry | |
Ammeter display failure | DC mag shot works, not the issue because they use the same Ammeter display |
DVM board failure | |
As you can see, we've listed 10 probable causes for the problem described by the customer, using the electrical schematics and basic technical knowledge of the control system. Knowing that the DC system works has already eliminated some of them.
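The first pass of elimination in the table above can be sketched as follows. The cause names and the "shared with DC" flags come straight from the table's notes column; the dictionary layout and the `eliminate_shared` helper are hypothetical, a minimal sketch rather than a real diagnostic tool.

```python
# Each probable cause from the table, flagged True when the working DC mag
# shot uses the same component (per the notes column).
CANDIDATES = {
    "Mag shot button failure": True,
    "PLC input failure": True,
    "PLC output failure": False,
    "Control relay failure": False,
    "Firing board - Command signal": False,
    "Firing board - Enable signal": False,
    "Firing board - Power sync": False,
    "Firing board - Circuitry": False,
    "Ammeter display failure": True,
    "DVM board failure": False,
}


def eliminate_shared(candidates, working_path_ok=True):
    """Drop causes already exercised by a path that is known to work."""
    if not working_path_ok:
        # Nothing can be ruled out if the reference path is also failing.
        return list(candidates)
    return [name for name, shared in candidates.items() if not shared]


remaining = eliminate_shared(CANDIDATES)
print(len(remaining))  # 7
for name in remaining:
    print(name)
```

Had the DC mag shot also failed, calling `eliminate_shared(CANDIDATES, working_path_ok=False)` would keep all 10 causes on the list.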
Step #3 - Action and testing
Now that you understand what the problem is and what its most likely cause is, testing can begin. This is when you can adjust, repair, or replace whatever is causing the issue. The goal is to return the system to the way it was before the problem occurred. The success of the troubleshooting process often depends on how good the technician's analysis has been up to this point.
Real life example
Action and testing
The firing board is a key component of the control system, and a few key tests at the firing board can eliminate several other probable causes of system failure.
Command signal analog voltage was tested and was OK.
Enable signal digital voltage was tested and was OK.
There was no voltage present on the DVM board when a mag shot was requested.
Let's look at our updated probable cause list and see where we stand with the remaining possible causes of machine failure.
Possible failure | Notes |
---|---|
PLC output failure | With the command signal and enable signal to the firing board working as intended we can eliminate a PLC failure |
Control Relay failure | With the enable signal to the firing board working as intended we can eliminate a relay failure |
Firing board - Command signal | Tested OK |
Firing board - Enable signal | Tested OK |
Firing board - Power sync | |
Firing board - Circuitry | |
DVM board failure | No incoming signal voltage |
By process of elimination, only three untested causes remain: the firing board power sync, the firing board circuitry, and the DVM board. From the information gathered in step 1, we know this is a new machine going through quality control, so we can assume for now that the firing board and DVM board are OK from the factory. This leaves "Firing board - Power sync" as the next most likely cause of the problem.
When we checked, swapping the AC sync (flipping the phase of the AC power) did in fact solve the problem. This would be an unlikely cause out in the field, but because this was a new, untested system, the information gathered during troubleshooting made it a likely one.
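The elimination worked through above can also be written down as a small sketch. The two test results are the ones reported for the firing board; the `CLEARED_BY` mapping simply restates the notes column of the updated table, and its structure and the `still_open` helper are assumptions made for illustration.

```python
# Results of the firing board checks described above.
TEST_RESULTS = {"command_signal": "ok", "enable_signal": "ok"}

# Which passing tests clear which cause (restating the table's notes).
# An empty set means no test so far can clear that cause.
CLEARED_BY = {
    "PLC output failure": {"command_signal", "enable_signal"},
    "Control relay failure": {"enable_signal"},
    "Firing board - Command signal": {"command_signal"},
    "Firing board - Enable signal": {"enable_signal"},
    "Firing board - Power sync": set(),
    "Firing board - Circuitry": set(),
    "DVM board failure": set(),
}


def still_open(cleared_by, results):
    """Return the causes that the test results have not ruled out."""
    open_causes = []
    for cause, needed in cleared_by.items():
        cleared = bool(needed) and all(results.get(t) == "ok" for t in needed)
        if not cleared:
            open_causes.append(cause)
    return open_causes


for cause in still_open(CLEARED_BY, TEST_RESULTS):
    print(cause)
# prints the three untested causes: power sync, circuitry, DVM board
```

A cause is only cleared when every test it depends on has actually passed, so a missing or failed measurement leaves it on the list.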
One of the most important rules of troubleshooting is to never assume anything.
What if it hadn't solved the problem? Then it would be time to swap out components to see whether that changed anything. Test, check, and re-test until you can prove that a device is not the cause of your problem. Too many times a brand-new product has been found defective out of the box, or a sensor's LED is lit but the sensor is not sending the correct voltage to a PLC input and is therefore defective. If you assume that a device is working, it may be a long trip down the rabbit hole before you find out that the assumption was wrong.
Step #4 - Document the process
The final step is to document all steps taken. This ensures other troubleshooters will know what to do if the problem happens again. It's critical to document both the solution and the fixes that didn't work to provide a comprehensive record of the incident. Documentation will also help in creating troubleshooting checklists to quickly identify and fix potential problems.
Real life example
Document the process
Problem solved, time to pack up and go home.....
Not so fast. One of the most important steps is to document the process used to find the problem. By keeping good records you can save yourself time and track trends. Why do that? A good record of problems found can lead to improved processes and documentation, so that the problem can be eliminated in the future. Common product failures can be designed out of systems, newer and more reliable products can be found, or service items can be kept on hand for future machine failures. Only a robust documentation process makes this possible.
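As a rough illustration of what such a record might look like, here is a minimal Python sketch. The `ServiceRecord` fields and both helper functions are hypothetical; the example values are taken from the case described in this article.

```python
import json
from collections import Counter
from dataclasses import asdict, dataclass, field


@dataclass
class ServiceRecord:
    """One troubleshooting incident (field names are illustrative only)."""
    machine: str
    symptom: str
    root_cause: str
    fix: str
    ruled_out: list = field(default_factory=list)  # checks that found no fault


def to_json(record):
    """Serialize a record so it can be stored or shared."""
    return json.dumps(asdict(record), indent=2)


def root_cause_trends(records):
    """Count how often each root cause appears, most frequent first."""
    return Counter(r.root_cause for r in records).most_common()


record = ServiceRecord(
    machine="D-2060 Mod 2.5",
    symptom="No mag shot when requested on AC",
    root_cause="Firing board - Power sync",
    fix="Swapped the AC sync (flipped the phase of the AC power)",
    ruled_out=["Command signal tested OK", "Enable signal tested OK"],
)

print(to_json(record))
print(root_cause_trends([record]))
```

Recording the checks that found no fault alongside the fix keeps the next technician from repeating them, and counting root causes across many records is one simple way to spot the trends mentioned above.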
Final thoughts
A well-thought-out process and approach to troubleshooting will save you time and money, not to mention a lot of headaches. Troubleshooting is an iterative, trial-and-error process that is repeated until the issue is fixed. See how we can help you get your or your customers' machines back up and running and find creative solutions to reduce downtime.