Zach McClain

Troubleshooting the Right Way

When problems are found in test or in the field, it can be incredibly frustrating, especially if the problem’s behavior is inconsistent or if it happens once and then goes away completely. Is the issue due to some external circumstance like water intrusion or foreign object debris, or does it point to something more diabolical like a design issue? Troubleshooting a problem the right way is vital to resolving issues completely.


A common mistake is to chalk a problem up to chance or equipment failure and move on without finding the real cause. Another mistake is to find the direct cause (for example, a blown capacitor) but stop there without finding the root cause (what caused the capacitor to blow). Finding the root cause is necessary to ensure the product design is sound, to confirm the problem won’t happen again once the failed part is repaired, and to make sure the issue isn’t generic, with the potential to affect other units that may already be built or even in the field.


Here is an example of one challenging troubleshooting experience I had.


The Faulty Timer


During the development of one avionics box I was involved with, a problem arose where the box would occasionally misbehave and fail a test. The issue only happened when the box was put into a certain mode where many box functions were operating simultaneously. It did not occur when each function was tested individually. Even in the mode that revealed the issue, it did not occur reliably; it only happened occasionally.


To address this problem, our team used one key failure investigation tool: the Fault Fishbone. After looking at all of the obvious potential causes without finding the problem, the team developed a Fault Fishbone and systematically worked through every potential cause on it. For each potential cause, a test would be developed, the test would be performed, and the results analyzed. The results might tell us the potential cause is unrelated to the issue, and it would be marked as unlikely or not credible. Or the results might confirm the potential cause does correlate with the issue, and it would be marked as credible or likely. Sometimes the results are not so cut and dried, which might suggest the test was not set up the right way to begin with, or that the potential cause is slightly different than first perceived. Whatever the test result, information is gained that helps narrow down where the underlying issue might be and leads to more tests to continue the process. By using this approach, the team was able to work through many potential causes until we had narrowed things down enough to finally find the true issue.
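
The bookkeeping behind a fishbone investigation can be sketched in a few lines of Python. This is only an illustration of the process described above, not the tool our team used: the class names, status labels, and example causes are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    OPEN = "open"                  # not yet tested
    NOT_CREDIBLE = "not credible"  # test showed no relation to the issue
    CREDIBLE = "credible"          # test showed a correlation with the issue
    INCONCLUSIVE = "inconclusive"  # test setup or hypothesis needs rework


@dataclass
class PotentialCause:
    branch: str        # fishbone branch (hypothetical names below)
    description: str
    status: Status = Status.OPEN
    notes: list = field(default_factory=list)

    def record(self, status, note):
        """Store the outcome of one test against this potential cause."""
        self.status = status
        self.notes.append(note)


def remaining(causes):
    """Causes still worth pursuing: everything not ruled out."""
    return [c for c in causes if c.status is not Status.NOT_CREDIBLE]


# Hypothetical fishbone entries, for illustration only.
fishbone = [
    PotentialCause("Power", "Supply droop when all functions run at once"),
    PotentialCause("Test setup", "Intermittent cable or connector"),
    PotentialCause("Firmware", "Timing-dependent logic bug"),
]

fishbone[0].record(Status.NOT_CREDIBLE, "Rails stayed in tolerance under full load")
fishbone[1].record(Status.INCONCLUSIVE, "Swapped cable; failure rate too low to judge")
fishbone[2].record(Status.CREDIBLE, "Failure rate changed with modified test program")

for cause in remaining(fishbone):
    print(f"{cause.branch}: {cause.description} [{cause.status.value}]")
```

Keeping the inconclusive entries on the list matters: as noted above, an unclear result often means the test or the hypothesis needs to be reworked, not that the cause has been ruled out.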


After using many different test programs to alter the FPGA behavior, we finally discovered the direct cause to be a logic bug in the firmware. A timer in the FPGA design was not behaving as intended and caused collisions on common signal paths connected to the device. The issue would only occur occasionally because it depended on timing: where the misbehaving timer was in its count with respect to other events like incoming and outgoing traffic. Once the direct cause was confirmed, the fix was a straightforward correction of the logic bug in the firmware. The corrective action involved a more thorough review to determine how the bug got into the logic in the first place, and to correct any other places that might be susceptible to similar bugs.
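
To see why a timing-dependent bug like this only shows up occasionally, and only when several functions run together, consider the toy model below. It is not the actual FPGA logic. The function name, the timer period, and the traffic probability are assumptions chosen purely for illustration.

```python
import random


def count_collisions(timer_period, traffic_prob, cycles, seed=0):
    """Count clock cycles in which a timer-driven access and a traffic
    access both use the shared signal path at the same time.

    Toy model (assumed, not the real design): the timer drives the path
    once every `timer_period` cycles, and traffic drives it in any given
    cycle with probability `traffic_prob`.
    """
    rng = random.Random(seed)
    collisions = 0
    for cycle in range(cycles):
        timer_fires = (cycle % timer_period == 0)
        traffic_active = rng.random() < traffic_prob
        if timer_fires and traffic_active:
            collisions += 1
    return collisions


# Timer function tested on its own: no traffic, so no collisions.
print(count_collisions(timer_period=100, traffic_prob=0.0, cycles=1000))

# Both active with sparse traffic: collisions are possible but infrequent,
# and whether any occur depends on how the traffic lines up with the timer.
print(count_collisions(timer_period=100, traffic_prob=0.05, cycles=1000))
```

In this model a collision needs two independent events to line up in the same cycle, which is why testing each function individually never reveals the problem and why the combined mode fails only some of the time.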


McClain Electronics Engineering Can Help


McClain Electronics Engineering has extensive experience troubleshooting difficult failures, so we can find the root cause and implement a complete corrective action. We have many tools at hand to help with a challenging troubleshooting effort, and we’ll ensure the troubleshooting is performed the right way.

Let us know if we can help with any of your product design and development needs.


Zach McClain

McClain Electronics Engineering



©2021 by McClain Electronics Engineering, LLC.
