Why Stress Must Be Included in Modern RAM Analysis

800 Words, 30 Min Read

AeROS Software Used

Author: Didi Rooscote, Reliability Engineer, PLN Indonesia Power

Coauthor: Hongan Lin, Reliability Data Analyst, AssetStudio

Introduction

In traditional RAM analysis, reliability is treated primarily as a function of time.

We collect failure events.

We calculate MTBF.

We fit statistical distributions.

And we assume the resulting failure intensity reflects the intrinsic behaviour of the equipment.

But what if operating stress is not constant?

What if production itself is varying over time?

Then the MTBF we calculate may not represent intrinsic reliability — it may reflect operating conditions.

Step 1: The Conventional Approach

Figure 1: Functional Diagram of a combined-cycle power plant

In a combined-cycle power plant example, three Gas Turbines operate in parallel, feeding a Steam Turbine.

Over 3 years, failure events were extracted from the maintenance timeline.

When plotting the failure events over time, an apparent reduction in failure intensity can be observed.

This raises an important question: Has the equipment become more reliable? Or has operating stress changed?

If production decreases over time, stress decreases. If stress decreases, life consumption slows. If life consumption slows, failure intensity appears to reduce.

Similarly, increasing production can produce an apparent worsening reliability trend — even if the intrinsic failure mechanism remains unchanged.

Without accounting for stress variation, RAM analysis can misinterpret operational effects as reliability improvement or deterioration.

From the failure timelines, MTBF can be estimated using MTBF = T/N, where T is the total observation time and N is the number of failures in that period. This is standard practice in RAM studies.

However, this calculation assumes that the operating stress during those 3 years was constant. In reality, it was not.
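
As a minimal sketch of the conventional estimate, the calculation can be written as below. The failure count is hypothetical and is not taken from the case study; only the 3-year observation window comes from the text.

```python
def mtbf(total_time_hours: float, n_failures: int) -> float:
    """Conventional point estimate: MTBF = T / N."""
    if n_failures <= 0:
        raise ValueError("at least one failure is needed for a point estimate")
    return total_time_hours / n_failures


HOURS_PER_YEAR = 8760
observation_hours = 3 * HOURS_PER_YEAR  # 3 calendar years, as in the example
failures = 12                           # hypothetical count, for illustration only

estimate = mtbf(observation_hours, failures)
print(f"MTBF = {estimate:.0f} h")  # prints "MTBF = 2190 h"
```

Nothing in this function knows how hard the equipment was run during the window, which is exactly the limitation discussed here.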

Figure 2: Failure timelines of the 3 Gas-Turbine and 1 Steam-Turbine

Apparent Reliability Trend — Real Reliability Change?

When the observed failure intensity decreases, we must ask whether reliability has truly improved.

Operational stress can fall over time as production is trimmed, which makes failures appear less frequent.

The observed change in failure intensity may be due to life consumption slowing under reduced stress.

In this example, the same intrinsic failure mechanism can produce different failure patterns depending on production.

Reliability metrics that ignore stress are therefore at risk of misinterpretation.

A constant MTBF assumption is only safe if the underlying operating stress was unchanged.

In practice, production variation is common in power generation portfolios.

That means the RAM analyst must separate intrinsic reliability from stress-driven effects.

If stress is not modelled, the trend may look like a reliability gain or deterioration without any change in the equipment itself.

This is especially important when using RAM for maintenance planning or investment decisions.

The failure timeline can look smoother simply because the plant is operating at lower stress.

In other situations, higher output can make reliability appear worse even though the component has not changed.

Proper stress-aware modelling helps avoid these misleading conclusions.

The Missing Variable: Production as Stress

Figure 3: The production rates have impact on failure intensity

During those 3 years, total production from the three Gas Turbines was 2,300,000,000 kWh, and the average production rate per turbine was approximately 30.3 MW.

This average production rate becomes the reference stress.

We assume that, had the turbines operated continuously at 30.3 MW (that is, 30.3 MWh per hour), we would expect the same MTBF as observed historically.

This value is therefore defined as the Design Flowrate in AeROS.
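
The reference stress is an average rate: energy divided by the number of units and by the hours over which it was produced. The sketch below uses the total production and turbine count quoted above. The time basis is an assumption, because the article does not state it. With every calendar hour of the 3 years counted, the result is about 29.2 MW per turbine; a basis of roughly 25,300 hours per turbine (for example, operating hours excluding downtime) would reproduce the quoted 30.3 MW. The function name and the hours parameter are illustrative, not AeROS inputs.

```python
HOURS_PER_YEAR = 8760


def average_rate_mw(total_energy_kwh: float, n_units: int, hours_per_unit: float) -> float:
    """Average production rate per unit in MW (1 MW sustained for 1 h = 1000 kWh)."""
    if n_units <= 0 or hours_per_unit <= 0:
        raise ValueError("n_units and hours_per_unit must be positive")
    return total_energy_kwh / n_units / hours_per_unit / 1000.0


total_energy_kwh = 2_300_000_000     # from the article
n_turbines = 3                       # from the article
calendar_hours = 3 * HOURS_PER_YEAR  # assumption: all calendar hours counted

rate = average_rate_mw(total_energy_kwh, n_turbines, calendar_hours)
print(f"{rate:.1f} MW per turbine on a calendar-hour basis")
```

Whichever time basis is used, it should match the basis used for T in the MTBF estimate, so that the MTBF and the reference stress describe the same operating history.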

The LSR (Life-Stress Relationship) option is checked to indicate that asset reliability changes with production according to the LSR model. If it is unchecked, the asset's reliability is not affected by production variation.

Setting Design Flowrate = 30.3 means the MTBF is derived at a production rate of 30.3 MWh/hour. You may choose to simulate at a different production level, and the model will adjust the reliability parameters according to the specified LSR model.
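
To illustrate what such an adjustment could look like, the sketch below scales a reference MTBF to another production rate. It assumes an inverse power law, MTBF(s) = MTBF_ref × (s_ref / s)^n, which is one common life-stress form; the article does not say which LSR model AeROS applies, so the model form, the exponent and the reference MTBF value here are all hypothetical.

```python
def mtbf_at_stress(mtbf_ref: float, design_flowrate: float,
                   flowrate: float, exponent: float) -> float:
    """Scale a reference MTBF to another production rate.

    Assumed inverse power law: MTBF(s) = MTBF_ref * (s_ref / s) ** n.
    """
    if flowrate <= 0 or design_flowrate <= 0:
        raise ValueError("flowrates must be positive")
    return mtbf_ref * (design_flowrate / flowrate) ** exponent


DESIGN_FLOWRATE = 30.3  # MW, the reference stress from the article
MTBF_REF = 2000.0       # hours, hypothetical value
EXPONENT = 2.0          # hypothetical LSR exponent

for rate_mw in (20.0, 30.3, 40.0):
    adjusted = mtbf_at_stress(MTBF_REF, DESIGN_FLOWRATE, rate_mw, EXPONENT)
    print(f"{rate_mw:5.1f} MW -> MTBF about {adjusted:.0f} h")
```

At the Design Flowrate the function returns the reference MTBF unchanged; below it the MTBF lengthens, and above it the MTBF shortens.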

Figure 4: MTBF of Gas-Turbine sub-system operating at 30.3 MWh/hour

Figure 5: The Design Flowrate and Life-Stress-Relation (LSR) settings

In most RAM studies, reliability is treated as a function of time. But in real production systems, reliability is strongly influenced by how equipment is operated.

Brownfield Assumption: Intrinsic Failure Is Constant Under Constant Stress

For brownfield systems, it is often reasonable to assume that intrinsic asset-level failure behaviour can be approximated as exponential under constant stress.

At asset level, multiple failure modes combine to produce approximately constant hazard behaviour.

This assumption is valid — but only under constant stress.

When stress varies, the observed failure intensity no longer reflects intrinsic behaviour alone.

It reflects the interaction between the stress history and the intrinsic failure behaviour.

Failure points can be adjusted according to the actual stress history using a cumulative damage model.

Figure 6: The failure intensity changes with stress (production rate)

This allows the engineer to distinguish between:

  • Intrinsic reliability behaviour
  • And stress-driven variation
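
One way to picture the cumulative damage adjustment is to convert each failure's calendar time into equivalent hours at the reference stress. The sketch below does this for a piecewise-constant production profile, assuming that life is consumed at a rate of (s / s_ref)^n per calendar hour. The power-law form, the exponent, the profile and the failure times are hypothetical and are not the AeROS implementation.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (duration in hours, production rate in MW)


def equivalent_hours(profile: List[Segment], t: float, s_ref: float, n: float) -> float:
    """Life consumed by calendar time t, expressed in hours at the reference stress.

    Time beyond the end of the profile is ignored.
    """
    consumed = 0.0
    start = 0.0
    for duration, rate in profile:
        if t <= start:
            break
        run = min(duration, t - start)          # hours spent in this segment
        consumed += run * (rate / s_ref) ** n   # damage accrued at this stress
        start += duration
    return consumed


# Hypothetical declining production over three years.
profile = [(8760.0, 40.4), (8760.0, 30.3), (8760.0, 20.2)]
failure_times = [2000.0, 5000.0, 9000.0, 14000.0, 22000.0]  # hypothetical, calendar hours

for t in failure_times:
    eq = equivalent_hours(profile, t, s_ref=30.3, n=2.0)
    print(f"calendar {t:7.0f} h -> equivalent {eq:7.0f} h")
```

If failures that thin out on the calendar axis look evenly spread on the equivalent-hours axis, the apparent trend is more plausibly stress-driven than intrinsic.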

The Stress Aware Approach

For high-level RAM modelling, the stress-aware approach can be summarised as follows:

  1. Find the MTBF and the average production rate (Figure 4).
  2. Apply a Life-Stress Relationship (LSR) model (Figure 5).
  3. Run the simulation:
    • For constant throughput, specify the desired production rate (stress).
    • For non-constant throughput, AeROS provides a construct (Profile Node) that lets the user define the production profile. See Figure 6 for a conceptual illustration.

Figure 7: Stress-responsive Failure behaviour
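
The three steps can be sketched as a small Monte Carlo simulation. This is a conceptual illustration under stated assumptions, not the AeROS engine or its Profile Node: failures are exponential at the reference stress, life is consumed at a rate of (s / s_ref)^n (an assumed power law), repair downtime is ignored, and the MTBF, exponent and production profile are hypothetical.

```python
import random
from typing import List, Tuple

Segment = Tuple[float, float]  # (duration in hours, production rate in MW)


def simulate_counts(profile: List[Segment], mtbf_ref: float, s_ref: float,
                    n: float, rng: random.Random) -> List[int]:
    """One simulated history: failures counted per profile segment."""
    if not profile:
        return []
    # Life consumed in each segment, in hours at the reference stress.
    consumed = [hours * (rate / s_ref) ** n for hours, rate in profile]
    counts = [0] * len(profile)
    # Failures arrive at a constant rate on the equivalent-hours clock.
    next_failure = rng.expovariate(1.0 / mtbf_ref)
    seg, seg_end = 0, consumed[0]
    while seg < len(profile):
        if next_failure <= seg_end:
            counts[seg] += 1
            next_failure += rng.expovariate(1.0 / mtbf_ref)
        else:
            seg += 1
            if seg < len(profile):
                seg_end += consumed[seg]
    return counts


def mean_counts(profile: List[Segment], mtbf_ref: float, s_ref: float,
                n: float, runs: int = 2000, seed: int = 1) -> List[float]:
    """Average failures per segment over many simulated histories."""
    rng = random.Random(seed)
    totals = [0] * len(profile)
    for _ in range(runs):
        for i, c in enumerate(simulate_counts(profile, mtbf_ref, s_ref, n, rng)):
            totals[i] += c
    return [t / runs for t in totals]


DESIGN_FLOWRATE = 30.3  # MW, reference stress from the article
MTBF_REF = 2000.0       # hours, hypothetical
EXPONENT = 2.0          # hypothetical LSR exponent
# Hypothetical profile: one segment per year, production declining.
profile = [(8760.0, 40.4), (8760.0, 30.3), (8760.0, 20.2)]

means = mean_counts(profile, MTBF_REF, DESIGN_FLOWRATE, EXPONENT)
for (hours, rate), m in zip(profile, means):
    print(f"{rate:5.1f} MW: {m:.2f} failures per year on average")
```

The intrinsic MTBF never changes in this sketch, yet the yearly failure count falls as production falls, which is the apparent "improvement" a time-only analysis would report.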

In our previous article, "The Hidden Cost of Load Sharing in Production Systems," we also noted that stress changes in a load-sharing configuration. Therefore, even when production is constant, an asset's reliability can still be affected by maintenance events on other assets.

Why This Matters at Enterprise Level

In modern power generation environments:

  • Dispatch levels fluctuate.
  • Renewable integration changes load profiles.
  • Operational strategies evolve.
  • Production targets vary over time.

If stress is ignored in RAM modelling:

  • MTBF may be biased.
  • Reliability trends may be misinterpreted.
  • Lifecycle cost analysis (LCCA) may be distorted.
  • Maintenance planning may be misaligned.

At portfolio scale, these modelling assumptions can materially affect budget forecasts and asset replacement strategies.

Reliability is not independent of how assets are operated.

Production strategy influences degradation rate.

From Static RAM to Stress-Aware Reliability

Traditional RAM answers:

"What is the availability?"

Stress-aware RAM answers:

"How does operating strategy influence life consumption?"

Given failure timelines and production history, we can derive the reference stress.

AeROS integrates:

  • Operating profile
  • Reference stress normalization
  • Life-Stress Relationship (LSR) modelling
  • Cumulative damage theory

This transforms RAM from a static statistical exercise into a dynamic decision-support tool.

The Strategic Message

If production changes, reliability changes.

Ignoring stress may produce clean statistics — but incomplete insight.

In asset-intensive industries, incomplete insight is costly.

Modern RAM analysis must reflect operational reality.

Because in today's energy systems, production is not constant — and reliability should not be modelled as if it were.

- End -