Minutes of Study Group Meeting, 2017-12-18

Meeting called to order: 11:04 AM EST

The slide references relate to the pack used during this meeting, located here: http://files.sjtag.org/StudyGroup/SG_Meeting_18.pdf

1. Roll Call

Ian McIntosh (Leonardo MW Ltd.)
Heiko Ehrenberg (Goepel Electronics) (joined 11:07)
Eric Cormack (DFT Solutions Ltd.)
Brian Erickson (JTAG Technologies)
Peter Horwood (Firecron Ltd.)
Bill Huynh (Marvell Inc.)
Joel Irby (ARM)
Mukund Modi (NAVAIR Lakehurst)
Jon Stewart (Dell)
Carl Walker (Cisco Systems)
Ed Gong (Intel Corp.)
Dilipan Jayachandran (Schweitzer Engineering Laboratories, Inc.) (joined 11:07)
Russell Shannon (NAVAIR Lakehurst)
Michael D. Sudolsky (Boeing)
Louis Ungar (ATE Solutions)
Sivakumar Vijayakumar (Keysight)

By email (non-attendees):

Terry Duepner (National Instruments)
Bill Eklow (Retired)
Richard Pistor (Curtiss-Wright)
Naveen Srivastava (Nvidia)
Brad Van Treuren (Nokia)

2. IEEE Patent Slides

  • {Slides 5-9}

3. Review and Approve Previous Minutes

  • {Slide 10}
  • December 11
    • Draft circulated 12/11/17
    • No corrections noted.
    • Brian moved to approve, seconded by Eric, no objections or abstentions. Approved.

4. Review Open Action Items

  • {Slide 11}
  • [10.2] Brad will draft a definition for "boundary".
    • ONGOING.
  • [14.1] ALL: Develop Purpose description.
    • ONGOING.
  • [17.1] Louis: Search for any standards that address what constitutes a fault (or pass) at system level.
    • ONGOING.
    • There doesn't appear to be any such standard. Seeking some response via LinkedIn.
    • Possible avenue via FMECA/FMEA combined with reliability assessments if the FMECA is done during the design (it often isn't).

5. Discussion Topics

a) Aim 'A' - "Health reporting": What are people really expecting?

  • {Slides 12-13}
  • (The slides capture Ian's "seed" questions and Terry's responses, from emails).
  • At a very basic level, current BIT only suggests "this box is good" or "this box is bad". What is really wanted is "this card (or this part of a card) is good/bad", i.e. pre-diagnosis before sending to the repair line. A deeper level of diagnostics is needed at the aircraft, with greater confidence that the right part is indicted, as it can take 4-5 hours to remove a box. A full test of a box can take several hours, so being able to focus on the fault area can speed up maintenance.
  • JTAG is often known to be available but needs to be able to communicate up to a higher level.
  • BIT is often defined by the contract in terms similar to "Detect and isolate 95% of all single faults in the agreed fault catalogue to a single LRI/FRU, 98% to two LRIs and 100% to three LRIs". The "Agreed Fault Catalogue" will exclude difficult-to-diagnose faults so that the numbers are achievable.
  • More information can help with intermittent faults.
  • Collecting the data may require a data source other than BIT.
  • It may be the case that manufacturers already monitor/collect additional data on faults for their own test and diagnostics, but don't present it in the "public" interface the customer sees because it is not contractually required.
  • There is a subtle issue in that a BIT pass is often mistaken for proof that the unit is working, but a status of "good" doesn't mean there is no defect. BIT should be oriented to finding faults.
  • It is easy to convince ourselves that running a quick BIT test is enough to say the unit is good.
  • This needs to be stated in the original requirement placed by the customer - Firecron demonstrated this kind of thing with QinetiQ around 2006, so it is possible, but it needs to be in the specification so that the system gets architected correctly.
  • How much of this can you capture in a generic standard and how much is driven by the specifics of the product/applications?
  • If we had a perfect BIT then we might not even need this discussion, but if we can sample some additional data or gain some additional JTAG access then it may be worth levying some additional requirements.
  • We need better tools to direct the maintenance. Some faults may occur under vibration. It's useful to collect any environmental information at the time of a fault: the environment may be a factor in NFFs, as the test performed when an item is returned is unlikely to replicate all the conditions.
  • Adding sensors can add to cost and weight, to which the customer is often sensitive, but leveraging JTAG at system level doesn't require much new hardware. Boeing feel they're doing pretty well with current vibration sensing in their Vehicle Health monitoring.
  • Collecting that environmental data can also protect the vendor if the product is used outside of its specified conditions. You can present an argument that the vendor as well as the customer will benefit - a parallel to the designer being the first person to benefit from incorporating DFT in a board design, when he has to debug the first board out of manufacture.
  • There can be a disconnect between design, manufacture and the field as to what testing is or can be done at each.
  • There is no intent to limit the design - want to know which of 8 cards is faulty, don't want to specify how that is done.
  • A common response from the vendor is "If you want this level of DFT then it'll cost money". A response might be "Well, if that's the case then show us that as a result we'll be saving money".
  • MIL-STD-2165A addressed a lot of testability planning and management but was dropped. That was probably an Air Force document; the Navy is working to update MIL-STD-1814 on Integrated Diagnostics. However, many corporations may have their own guidance based on MIL-STD-2165A, otherwise it could be seen as "an imposition". Some commercial customers are prepared to pay more for "health-enabled" systems.
  • Commercially, “Health Management Enabled” units are preferred since numerous Maintenance Improvements are supported:
    • Suppliers sell more units vs. competitors that are not similarly "Enabled",
    • Also, suppliers often charge more relative to System data reporting improvements (enabled via software).
      (On the Military front, Flexible Sustainment contracts and “paying for performance” are striving to achieve similar results.)
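
The contract-style wording quoted above ("95% to a single LRI/FRU, 98% to two LRIs and 100% to three LRIs") amounts to a cumulative check over the agreed fault catalogue. A minimal sketch of that arithmetic follows. The function names and the catalogue figures are hypothetical, chosen only for illustration, and the sketch assumes every catalogue fault is detected and has a known ambiguity group size (the number of LRIs to which BIT isolates it).

```python
from collections import Counter

# Thresholds taken from the contract-style wording quoted in the discussion:
# cumulative fraction of catalogue faults isolated to at most N LRIs.
THRESHOLDS = {1: 0.95, 2: 0.98, 3: 1.00}


def isolation_coverage(ambiguity_sizes):
    """Return {n: fraction of faults isolated to n or fewer LRIs}."""
    total = len(ambiguity_sizes)
    if total == 0:
        raise ValueError("fault catalogue is empty")
    counts = Counter(ambiguity_sizes)
    coverage = {}
    for n in sorted(THRESHOLDS):
        isolated = sum(c for size, c in counts.items() if size <= n)
        coverage[n] = isolated / total
    return coverage


def meets_contract(ambiguity_sizes):
    """True if every cumulative threshold is met."""
    coverage = isolation_coverage(ambiguity_sizes)
    return all(coverage[n] >= THRESHOLDS[n] for n in THRESHOLDS)


# Hypothetical catalogue of 200 faults (illustrative numbers only):
# 191 isolate to one LRI, 6 to two LRIs, 3 to three LRIs.
catalogue = [1] * 191 + [2] * 6 + [3] * 3
coverage = isolation_coverage(catalogue)
print(coverage, meets_contract(catalogue))
```

Because the check runs only over the agreed catalogue, any fault excluded from that catalogue never appears in the figures, which is the point made in the discussion about the numbers being achievable.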

6. Today's Key Takeaways

  • {Slide 14}
  • BIT needs to support diagnostics to a finer granularity at the first level, not just an assertion of "good/bad".
    • Provide metadata (e.g. environmental conditions) associated with the fault occurrence.
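
As a rough illustration of the takeaways above, a fault report could carry the indicted card and the environmental conditions alongside the good/bad status. The sketch below is an assumption throughout: the record layout, field names and example values are hypothetical and are not drawn from any existing BIT interface or standard.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class FaultRecord:
    unit: str        # box / LRI identifier (hypothetical naming)
    suspect: str     # card, or part of a card, indicted by BIT
    status: str      # "good" or "bad"
    timestamp: str   # ISO 8601 time at which the result was recorded
    environment: dict = field(default_factory=dict)  # conditions at the time


def make_record(unit, suspect, status, **environment):
    """Build a record stamped with the current UTC time."""
    now = datetime.now(timezone.utc).isoformat()
    return FaultRecord(unit, suspect, status, now, dict(environment))


# Example values are invented for illustration only.
record = make_record("BOX-1", "card 3", "bad",
                     temperature_c=71.5, vibration_g_rms=4.2)
report = json.dumps(asdict(record), sort_keys=True)
print(report)
```

Keeping the environmental data as a free-form mapping avoids dictating which sensors a design must provide, in line with the stated intent not to limit the design.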

7. Glossary Terms from This Meeting

  • BIT - clarify in line with the above.
  • BIST - distinct from BIT in that it is a self-test that requires no resources other than the tested item itself.

8. Topic for next meeting

  • TBD (will be advised in the Calling Notice on Wednesday January 3).

9. Schedule next meeting

  • January 8, 2018.

10. Reminders

  • None.

11. Any Other Business

  • The software for the forums will be upgraded during the break. Appearance may change but functionality will remain the same.
  • Consider possible candidacy as an officer of any future working group. The offices are typically Chair, Vice-chair, Secretary and Editor. It is required that all officers of working groups are members of the Standards Association.

12. List New Action Items

  • None.

13. Adjourn

  • Brian moved to adjourn, seconded by Eric.
  • Meeting adjourned at 11:55 AM EST

Respectfully submitted,
Ian McIntosh