This subject area is becoming increasingly important with the introduction of autonomous, intelligent systems. The reason lies in the definition of resilience itself: a technical system responds intelligently to events, both predictable and unpredictable, in order to keep operating within a safe range or to return to that range. This extremely challenging task can only be achieved by developing integrative solutions together with the other research fields and research programs of IHP.
The working group takes a dual approach:
First, concrete concepts for increasing the resilience of systems are being investigated, for example improved IT security solutions and reliability concepts. This includes approaches for the resilient design of sensor nodes and communication protocols, each of which is implemented and analyzed through to realization. These more application-driven aspects are also supported by the other departments in the Resilient Systems field of activity. On the theoretical side, where resilience is understood as a holistic system concept, the group has established an internationally recognized hallmark by integrating cognition as an essential component for dealing with unknown error situations. The core question of the research group's investigations is how cognition can be integrated into a technical system so that it can react autonomously to previously unknown situations. The development of resilient systems must therefore also consider operating conditions and, in particular, how they change over the devices' lifetime of several years.
The second scientific focus is on examining exemplary metrics for assessing resilience. Determining the resilience of a system is still an unsolved task. Generally accepted metrics must be developed, and their ability to actually measure resilience must be demonstrated. The challenge is how to determine the ability of a system to respond correctly to unknown situations; at best, this ability can be tested indirectly. The definition of resilience measures has been advanced successfully and will serve as the basis for assessing resilience mechanisms in further work. This includes testing the level of resilience achieved and methodologies for developing resilient systems. A key aspect is the development of theoretical models for predicting resilience properties during the development process. The difficulty is that the holistic approach to resilience, from material properties to ASIC design to communication protocols, must be understood and modeled. Sub-metrics may need to be used for the better understood aspects, and their combination explored and evaluated as the basis for a more complex metric. Another possibility would be to give metrics a kind of "blurring" relation that makes the conditional significance of individual values comprehensible.
In addition to assessing resilience, approaches for making systems more resilient will be explored. Here there is a clear cross-reference to the topics of the research groups Security Engineering, Hardware Security and Fault Tolerant Computing. The latter two belong to the System Architectures department. The insights gained from designing such "partial solutions" should result in a design methodology, or at least guidelines, that enable system developers to achieve "resilience by design". For this purpose, the individual solutions are to be transferred into a modular system and thus made reusable.
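The idea of combining sub-metrics and attaching a "blurring" relation to them can be illustrated with a minimal sketch. It is not the working group's metric: the `SubMetric` type, the weights, the normalization to [0, 1] and the interpretation of "blurring" as a simple uncertainty band are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubMetric:
    """Hypothetical sub-metric for one well-understood aspect of a system."""
    name: str
    value: float   # assumed normalized score in [0, 1]
    blur: float    # assumed half-width of the uncertainty band around value
    weight: float  # assumed relative importance of this aspect


def combine(metrics):
    """Weighted average of sub-metrics, returned as (low, center, high).

    The band [low, high] carries the "blur" of each sub-metric into the
    combined score, so a reader can see how far the result can be trusted.
    """
    total = sum(m.weight for m in metrics)
    if total <= 0:
        raise ValueError("at least one sub-metric needs a positive weight")
    center = sum(m.weight * m.value for m in metrics) / total
    # Clamp each band to [0, 1] before averaging so the result stays in range.
    low = sum(m.weight * max(0.0, m.value - m.blur) for m in metrics) / total
    high = sum(m.weight * min(1.0, m.value + m.blur) for m in metrics) / total
    return low, center, high


if __name__ == "__main__":
    # Illustrative values only; the layer names follow the text above.
    layers = [
        SubMetric("ASIC design", value=0.8, blur=0.1, weight=1.0),
        SubMetric("communication protocol", value=0.4, blur=0.2, weight=1.0),
    ]
    low, center, high = combine(layers)
    print(f"combined score {center:.2f} (band {low:.2f} to {high:.2f})")
```

A wide band signals that the combined value should be read with caution, which is one possible reading of making the conditional significance of individual values comprehensible.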
The Total Resilience working group cooperated very closely with all departments at IHP. This cooperation will be consolidated and intensified in the work of the new Resilience Engineering working group. One example is the studies on defect injection in RRAM cells, carried out together with the Materials Research department.
Main targets
- Understand basic means of intelligence
- Empower technical systems to perform self-assessment and self-healing in unexpected situations
- Empower technical systems to react appropriately to new situations
- Develop a design, test and assessment methodology for resilient systems
- Core Concept "Cognition"
Research topics
- Resilience Features
- Security on device and network level
- Efficient implementations of AI means
- Refined design methodology and tools
- Cognition
- Empower technical systems to invent solutions on their own
Research results
- Design Methodology and Metrics
- First version of design flow
- First approach to defining metrics
- Security and Reliability
- Atomicity principle: ineffective
- AI means to detect attacks against devices
- AI means applied to analyse and improve the design of hardware accelerators