Decisions Blog

The Rete Algorithm in Decisions and RULE.net

 

What is the Rete Algorithm?

Your first question is likely – how do I say “Rete”? Well, a quick search leading to several dictionaries suggests “Reet” is appropriate while “Ree-tee” and and other pronunciations are less common.

The goal of a Rete Algorithm is pretty simple, to reduce the number of rule executions to get to the same eventual result.  By evaluating the structure of the rules, inferences can be made to allow some rules to not execute in order as they are unnecessary.

A simple non technical example:

Fact:

  • Bob stole from Joe

Rules:

  • Stealing is a crime
  • Do not do business with someone who is a criminal
  • Do not lend money to someone we do not do business with

Execution

  • Data: Bob stole from Joe
  • Result: Do not lend money to Bob

As you can see, we can easily skip the second rule because the we can infer or remember Stealing = crime = no business relationship = no lending.  

The Rete Algorithm is a design methodology that sacrifices memory for speed. It optimizes execution times by reducing the number of rules or data a transaction requires with an intelligent network or structure of rules and data. The fundamental idea is simple, while implementations can grow to be quite complex.

The principle essentially boils down to creating the leanest decision tree possible at runtime. In a project where you might have millions or billions of business rule transactions – designing their execution in such a way that calls for each and every rule to run will add costly overhead to total execution times. The Rete Algorithm optimizes the rule execution pattern with “test rules” that are placed early and strategically in any execution in order to determine if other data or rules need to be run/accessed or not.

Thoughts on the RETE Algorithm

Obviously the simple example above is a overly trivial example of optimization and does not do justice for the types of large rule sets that can exist in real world scenarios.  It was designed in the 70’s and 80’s when computing processing power was a very small fraction of what it is today – however, rule execution optimization is still a very relevant topic.

A real implementation of RETE does this optimization automatically and chooses the rules to be executed for you at ‘compile’ or analysis stages.  This introduces a bit of ‘black box magic’ into the rule environment.  How much does the business user need to understand and/or trust automatic optimization of the rules in order to have confidence that all the rules are being properly executed.  

All the data for the rules must be able to be available in memory to be evaluated.  If you take a large data set like all possible medical procedures and insurance company policies related to them, the data set might be quite large and the cost of optimizing the patterns initially very expensive.  

The data evaluated also tends to be best if it is fairly flat data.  It is easy to optimize rules like Income > 100, but much harder to evaluate for instance the contents of an investment portfolio looking for particular problematic items.

RETE is only really necessary when the number of rules is very large or there are some types of rules that are really expensive to execute.  

Decisions and the Rete Algorithm

Decisions embraces all of the goals of the RETE Algorithm, however, our take on how we achieve these goals is a bit different.  As we see it, the benefits of RETE are

  • Not executing rules that are not needed.
  • Remembering expensive rule executions so they can be applied to later data executions.

However, the there are elements usage of rules in business solutions, there are a few additional goals that we also wanted to embrace

  • Avoid black box magic by making the relationships between rule dependency to be very explicitly declared and understood.
  • To support data structures that are much more complex or deeper.  RETE works best with fairly flat and non dynamic data structures.
  • Expanded vocabulary of rule verbs to allow logic to be extended into complex business concepts.  Instead of [Person] stole we might have a verb like [Person] is found in list of indicted or convicted people.  All the data does not have to be present at the beginning of rule execution, as rule verbs are flexible enough to incorporate additional data and/or highly tuned and encapsulated logic.

 

Decisions has built in rule collection mechanisms that supports ‘test rules’ to stop evaluation of rule conditions based on early evaluation.  These rule sets can be nested to form a dependency tree.  Using this methodology the rule dependency is very clear and will be understood by the analyst creating the rules.  

Further optimization is done by allowing rules to have contextual based memory of execution and results (including consideration for the data being considered in the rule).  While this means that the first execution of a rule may pay the cost of executing this rule, each following execution can benefit from this.

Decisions 4.0 (Due out in Q2, 2016) will include additional dependency tree and dependency discovery tools to help further optimize rule execution.

 

 

Supporting Sources/Authors: 

https://en.wikipedia.org/wiki/Rete_algorithm

https://techondec.wordpress.com/2011/03/14/rete-algorithm-demystified-part-2/

Carl Hewitt, Chief Architect at Decisions

Heath Oderman, CTO at Decisions