Minutes of Study Group Meeting, 2017-12-04

Meeting called to order: 11:03 AM EST

The slide references relate to the pack used during this meeting, located here: http://files.sjtag.org/StudyGroup/SG_Meeting_16.pdf

1. Roll Call

Ian McIntosh (Leonardo MW Ltd.)
Heiko Ehrenberg (Goepel Electronics) (joined 11:07)
Eric Cormack (DFT Solutions Ltd.)
Terry Duepner (National Instruments)
Bill Eklow (Retired)
Brian Erickson (JTAG Technologies)
Peter Horwood (Firecron Ltd.)
Bill Huynh (Marvell Inc.)
Joel Irby (ARM) (joined 11:26)
Rajesh Khurana (Cadence Design Systems) (joined 11:04)
Naveen Srivastava (Nvidia)
Jon Stewart (Dell) (joined 11:05)
Brad Van Treuren (Nokia) (joined 11:05)
Carl Walker (Cisco Systems)
Ed Gong (Intel Corp.)
Craig Stephan (INTELLITECH Inc.) (joined 11:05)
Louis Ungar (ATE Solutions)

By email (non-attendees):
---

Excused:
Russell Shannon (NAVAIR Lakehurst)
Sivakumar Vijayakumar (Keysight)

2. IEEE Patent Slides

  • {Slides 5-9}

3. Review and Approve Previous Minutes

  • {Slide 10}
  • November 27
    • Draft circulated 11/27/17
    • No corrections noted.
    • Terry moved to approve, seconded by Brian, no objections or abstentions. Approved.

4. Review Open Action Items

  • {Slide 11}
  • [10.2] Brad will draft a definition for "boundary".
    • ONGOING.
  • [14.1] ALL: Develop Purpose description.
    • ONGOING.
  • [15.1] Bill Eklow: Invite Al Crouch to talk about managing security in a JTAG environment.
    • Bill has emailed Al, but Al has not replied.
    • COMPLETE.

5. Discussion Topics

a) Scope, Purpose and Need:
    Continue developing concepts of what the standard(s) needs to address.

  • {Slide 12}
  • {Forum threads shared: http://forums.sjtag.org/viewtopic.php?f=3&t=774&p=1244#p1244 and http://forums.sjtag.org/viewtopic.php?f=3&t=740&p=1243#p1243}
  • Concern that the breadth of discussion may be expanding beyond what can be managed by a single group. For example, the security need could be a standard in its own right. But if we don't have access then we have nothing. Perhaps consider these as "qualifiers" within the Need but not part of the Scope for the standard.
  • There is probably not a good fault metric for the system level, and maybe not even for the board level. Who would actually find value in these kinds of metrics? We need to ensure that we have coverage of critical design features. This may get into custom aspects of each design that would be difficult to translate into generic measures.
  • There is a difference between the categories of structural faults and functional faults. Coverage has been a driver behind using embedded Boundary Scan, as it is easier to identify the coverage for that than for a functional test.
  • It gets harder to identify causality as you go up through the assembly levels.
  • There is obvious value in getting diagnostics that identify to a sub-system but not so much for quantitative metrics on faults. The Need has to be providing a value to the user: A value may be in getting test access to a sub-assembly.
  • In an automobile, you expect the electronics to tell you that it is all working properly: it is a functional test, but without metrics you don't know if the test covers all the things that could go wrong.
  • A test may show that a connection is made, but sweeping through a range of frequencies may show that it is not made very well. There may need to be some criteria for evaluating functional tests (see the frequency-sweep sketch at the end of this section).
  • Brad tried to search through IEEE standards for anything that related to recent discussion topics; the nearest he could find were:
    • IEEE Std 771-1998 IEEE Guide to the Use of the ATLAS Specification
    • IEEE Std 716-1995 Standard Test Language for All Systems – Common/Abbreviated Test Language for All Systems (C/ATLAS)
    • IEEE Std 1445-1998 IEEE Standard for Digital Test Interchange Format (DTIF)
  • There does not appear to be any IEEE standard on fault metrics, probably because it depends so heavily on the design, and there is no other document known to the group that addresses this.
  • Part of the issue may be that there are no tools to do design checks at the system level such as you may have at the chip level and, to some extent, at the board level. You may have an interconnect diagram to show how sub-assemblies connect together, but nothing equivalent to "CAD data".
  • There may be some relevance in the machine learning activities of the 1990s, which seem to be having a resurgence: can you predict where a fault is likely to occur based on history? (See the fault-history sketch at the end of this section.)
  • The price of not checking coverage is No Fault Found (NFF, or No Trouble Found) occurrences. An example: cruise controls removed as faulty (at the customer's cost) were found to be 94% NFF when returned to the factory. But is the test they run the right test? Often service people will just swap parts until the fault is fixed.
  • Low levels of NFF may not be examined, but will be attacked if they look like a recurring problem.
  • If a factory test is poor and allows escapes into the field then NFFs on return is simply to be expected. There is also the case where a returned board may be reprogrammed with the latest firmware prior to retest, which corrects the condition that caused the original fault report.
  • Are NFFs a big problem? Wouldn't industry be screaming about it? The US DoD sees it as a problem, with perhaps $20bn of cost associated with NFFs.
  • A few NFFs are not a problem in lower volumes.
  • Design for Testability in the product architecture is essential; otherwise you can have a case where a failure report indicts 5 boards equally. It can help to track what was done to correct particular error codes on previous occurrences.
  • Another place to look is the work Huawei were doing on analysis of failures at system level.
  • Potentially a huge market for machine learning in future systems.
  • In low volumes, it may be difficult to get much/any data.
  • Do we have any high volume manufacturers? Perhaps look into any NFF data on high volume products, e.g. laptops - Jon can ask {ACTION}.
  • Do Keysight (Duane Lowenstein?) have anything on system modelling? They had similar kinds of analysis at other levels previously - Jon/Louis can try to find out {ACTION}.
  • Should consider the Needs 'C' and 'D' in Louis' forum post as things that could affect the success of SJTAG (maybe dependencies); record/acknowledge them, but then re-focus on the main topics.
  • In higher volumes you may be inclined to spend more on improving test. How much you invest in test and diagnostic aids is influenced by Return on Investment and how much margin is in the product price. For high integrity systems (e.g. medical, some mil/aero) you may put a lot of effort into test, but price and margin will both be high for such products. All these are things that affect how much analysis you do.
  • There is the potential that tests that work well for a board in one system configuration may not work at all for that board in another configuration.
  • For a lot of customers, the key issue is "availability", so fault tolerance and redundancy are factors.
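
  The frequency-sweep point above can be made concrete with a minimal sketch. This is purely illustrative and was not part of the meeting discussion: the frequencies, loss figures and the 3 dB limit are all hypothetical.

    # Minimal sketch (hypothetical values throughout): a connection that
    # passes a low-frequency continuity check may still fail when judged
    # across a frequency sweep.

    def connection_ok(loss_db_by_freq, max_loss_db=3.0):
        """Pass only if measured loss stays within the limit at every frequency."""
        return all(loss <= max_loss_db for loss in loss_db_by_freq.values())

    # A marginal joint: fine at low frequency, degrading as frequency rises.
    measured = {
        1e3: 0.1,    # 1 kHz: looks like a good connection
        1e6: 0.4,    # 1 MHz: still acceptable
        100e6: 5.2,  # 100 MHz: excessive loss - the joint is poor
    }

    print(connection_ok(measured))   # False: fails the swept criteria
    print(measured[1e3] <= 3.0)      # True: a single low-frequency check would pass it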
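
  Similarly, the history-based fault prediction idea can be illustrated with a sketch. A plain frequency count stands in here for the machine learning that was mentioned, and every error code and sub-assembly name is hypothetical:

    # Minimal sketch (hypothetical data): rank the sub-assemblies most
    # often found faulty for a given error code in past repair records.
    from collections import Counter, defaultdict

    # (error_code, sub_assembly_found_faulty) pairs from previous repairs.
    history = [
        ("E42", "power_board"), ("E42", "power_board"), ("E42", "io_board"),
        ("E17", "cpu_board"), ("E17", "backplane"), ("E42", "power_board"),
    ]

    by_code = defaultdict(Counter)
    for code, board in history:
        by_code[code][board] += 1

    def likely_faults(code):
        """Sub-assemblies ranked by how often repairing them cleared this code."""
        total = sum(by_code[code].values())
        return [(board, n / total) for board, n in by_code[code].most_common()]

    print(likely_faults("E42"))  # [('power_board', 0.75), ('io_board', 0.25)]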

6. Today's Key Takeaways

  • {Slide 13}
  • Should list/acknowledge things that can affect success/effectiveness of SJTAG.
    • Possibly score them.
    • Identify which should be deferred to other groups.

7. Glossary Terms from This Meeting

  • None.
  • Carried over:
    • Test Salvage - Limited re-use of a test due to imposition of additional constraints.

8. Topic for next meeting

  • Scope, Purpose, Need - continue developing concepts of what the standard(s) needs to address.

9. Schedule next meeting

  • December 11.
    • Dec. 11 and 18 are the last meeting dates this year.

10. Reminders

  • None.

11. Any Other Business

  • None.

12. List New Action Items

  • [16.1] Jon: Enquire if there is data on NFF rates in higher volume product areas (e.g. laptops).
  • [16.2] Jon/Louis: Approach Duane Lowenstein to enquire whether Keysight have anything on system modelling.

13. Adjourn

  • Eric moved to adjourn, seconded by Terry.
  • Meeting adjourned at 12:11 PM EST

Respectfully submitted,
Ian McIntosh