Drools stream integration

2023-05-13

This article describes how to integrate a customer-provided Drools package into a datastream application.

Packaging:
If the customer provides a Maven project, make sure that its pom file contains the following:

<dependencyManagement>
    <dependencies>
      <dependency>
        <groupId>org.drools</groupId>
        <artifactId>drools-bom</artifactId>
        <type>pom</type>
        <version>xxx</version>
        <scope>import</scope>
      </dependency>
    </dependencies>
  </dependencyManagement> 
  <dependencies>
    <dependency>
      <groupId>org.kie</groupId>
      <artifactId>kie-api</artifactId>
    </dependency>
    <dependency>
      <groupId>org.drools</groupId>
      <artifactId>drools-compiler</artifactId>
      <scope>runtime</scope>
    </dependency>
    <!-- other dependencies -->
  </dependencies>
  <build>
    <plugins>
      <plugin>
        <groupId>org.kie</groupId>
        <artifactId>kie-maven-plugin</artifactId>
        <version>xxx</version>
        <extensions>true</extensions>
      </plugin>
    </plugins>
   </build>

In addition, a kmodule.xml file must be added to the src\main\resources\META-INF folder. A minimal kmodule.xml looks like the following.

<kmodule xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xmlns="http://www.drools.org/xsd/kmodule">  
    <kbase name="defaultKBase" default="true" eventProcessingMode="cloud" 
    equalsBehavior="equality" declarativeAgenda="enabled" >
        <ksession name="ksession1" type="stateless" default="true"/>
    </kbase>
</kmodule>  

The default stateless ksession is mandatory.

Rule files can be placed in src/main/resources as usual.

The command to create the jar file is still mvn package, as usual. However, the resulting jar is laid out a bit differently.

[Screenshot: contents of the generated jar]

Note that META-INF contains a knowledge base cache file and the kmodule file, and that the two rule files from main/resources are moved to the root folder of the jar.
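The jar layout described above can also be checked programmatically instead of by opening the archive by hand. The following is a minimal sketch that uses only the JDK. The class KjarLayoutCheck and its methods are hypothetical helpers, not part of Drools, and the sketch assumes that rule files use the .drl extension.

```java
import java.io.IOException;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

/** Hypothetical helper that inspects a packaged rule jar with plain JDK classes. */
final class KjarLayoutCheck {

    /** Returns true if the jar contains META-INF/kmodule.xml. */
    static boolean hasKmodule(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry("META-INF/kmodule.xml") != null;
        }
    }

    /** Lists every entry ending in .drl (assumed rule file extension), sorted by name. */
    static List<String> ruleFiles(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.stream()
                    .map(JarEntry::getName)
                    .filter(name -> name.endsWith(".drl"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        String jarPath = args[0]; // path to the jar produced by mvn package
        System.out.println("kmodule.xml present: " + hasKmodule(jarPath));
        System.out.println("rule files: " + ruleFiles(jarPath));
    }
}
```

Running it against the packaged jar before uploading gives a quick sanity check that the kmodule file was picked up by the build.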

What if the customer does not provide a Maven project? Probably the best strategy is to create a Maven project ourselves. If source code is provided, we import it into that project; otherwise, we could use the customer-provided jar as a Maven dependency.

Note that the KIE module concept was introduced in Drools 6, so I don't think this will work for Drools 5 and below. Also, the Drools integration in streamtau uses the latest version, 7.2.1, so whether earlier versions such as 6.x are fully compatible remains an open question.

Invocation:
Load rules:
First create a KieServices singleton instance.
private final KieServices kieServices = KieServices.Factory.get();

Load the drools package into system:

 protected DroolsDataHolder doLoadDroolsModule(DroolsLoadParam droolsLoadParam) {
        DroolsParameters origParams = droolsLoadParam.getDroolsParam();
        String moduleName = origParams.getModuleName();
        try {
            InputStream is = droolsDataLoader.getDroolsModuleAsStream(droolsLoadParam);
            KieContainer curContainer = DroolsUtils.buildContainer(kieServices, is);
            return new DroolsDataHolder(curContainer);
        }
        catch (Exception ex) {
            logger.error("Error loading drools " + moduleName, ex);
        }
        return null;
    }

DroolsDataLoader is an interface designed to load a Drools package as a stream (from either the file system or a RESTful interface).
DroolsUtils is the utility class that builds a KieContainer from the stream.

public static KieContainer buildContainer(KieServices kieServices, InputStream stream) throws Exception {
        Resource wrapped = kieServices.getResources().newInputStreamResource(stream);
        KieModule curModule = kieServices.getRepository().addKieModule(wrapped);
        ReleaseId releaseId = curModule.getReleaseId();
        logger.info("Release id generated for module: {}", releaseId);
        KieContainer kContainer = kieServices.newKieContainer(releaseId, DroolsUtils.class.getClassLoader());
        return kContainer;
    }

The returned DroolsDataHolder is merely a wrapper around the KieContainer:

public class DroolsDataHolder {
    private final KieContainer kieContainer;

    public DroolsDataHolder(KieContainer kieContainer) {
        this.kieContainer = kieContainer;
    }

    public KieContainer getKieContainer() {
        return kieContainer;
    }

    public void destroy() {
        kieContainer.dispose();
    }
}

The loaded DroolsDataHolder is cached until a rule changes, which triggers a reload:

public DroolsDataHolder getOrLoadDroolsModule(DroolsLoadParam droolsLoadParam) {
        DroolsParameters origParams = droolsLoadParam.getDroolsParam();
        String moduleName = origParams.getModuleName();
        dataLock.readLock().lock();
        try {
            DroolsDataHolder curHolder = containers.get(moduleName);
            if (curHolder != null) {
                return curHolder;
            }
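            // Cache miss. A read lock cannot be upgraded, so release it before taking the write lock.
            dataLock.readLock().unlock();
            dataLock.writeLock().lock();
            try {
                // Re-check: another thread may have loaded the module while we waited.
                curHolder = containers.get(moduleName);
                if (curHolder != null) {
                    return curHolder;
                }
                return doUpdateDroolsModule(droolsLoadParam);
            }
            finally {
                // Downgrade: reacquire the read lock so the outer finally can release it.
                dataLock.readLock().lock();
                dataLock.writeLock().unlock();
            }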
        }
        finally {
            dataLock.readLock().unlock();
        }
    }
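The caching pattern can be tried in isolation from Drools. The following is a simplified sketch, not the actual streamtau implementation: ModuleCache, getOrLoad and invalidate are hypothetical names, the type parameter V stands in for DroolsDataHolder, and the loader function stands in for the code that builds the KieContainer. Unlike the method above, it releases the read lock before taking the write lock and does not downgrade afterwards.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Function;

/** Hypothetical generic cache; V stands in for DroolsDataHolder. */
final class ModuleCache<V> {
    private final Map<String, V> containers = new HashMap<>();
    private final ReentrantReadWriteLock dataLock = new ReentrantReadWriteLock();

    /** Returns the cached value, loading it under the write lock on a miss. */
    V getOrLoad(String moduleName, Function<String, V> loader) {
        dataLock.readLock().lock();
        try {
            V cached = containers.get(moduleName);
            if (cached != null) {
                return cached; // fast path: concurrent readers do not block each other
            }
        } finally {
            dataLock.readLock().unlock();
        }
        dataLock.writeLock().lock();
        try {
            // Re-check, because another thread may have loaded the module in the gap.
            V cached = containers.get(moduleName);
            if (cached == null) {
                cached = loader.apply(moduleName);
                if (cached != null) {
                    containers.put(moduleName, cached);
                }
            }
            return cached;
        } finally {
            dataLock.writeLock().unlock();
        }
    }

    /** Removes an entry so that the next getOrLoad call reloads it, e.g. after a rule change. */
    V invalidate(String moduleName) {
        dataLock.writeLock().lock();
        try {
            return containers.remove(moduleName);
        } finally {
            dataLock.writeLock().unlock();
        }
    }
}
```

The value returned by invalidate is the old entry, so a caller holding DroolsDataHolder instances could call destroy() on it once no evaluation is still using it.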

Invoke the drools module:
In a stream environment, only stateless Drools knowledge sessions are supported for now. The main reason is that a stream runs in a distributed environment: sessions are created on multiple JVMs, so it is hard to share all the facts globally. Evaluating a rule is quite simple and consists of 3 steps:

  1. Convert stream data to the rule input POJO:
    public Class<?> getRulePojoClass(DroolsLoadParam droolsLoadParam, String inputPojoClassName) {
        DroolsParameters origParams = droolsLoadParam.getDroolsParam();
        String moduleName = origParams.getModuleName();
        DroolsDataHolder curDataHolder = this.getOrLoadDroolsModule(droolsLoadParam);
        if (curDataHolder == null) {
            throw new IllegalArgumentException("No drools module found by name: " + moduleName);
        }
        try {
            ClassLoader cl = curDataHolder.getKieContainer().getClassLoader();
            Class<?> inputPojoClass = cl.loadClass(inputPojoClassName);
            return inputPojoClass;
        } catch (Exception e) {
            throw RtException.from(e);
        }
    }

The good thing about a Drools module is that it provides a self-contained class loading environment, so third-party jar dependencies are unlikely to conflict with the outside runtime environment. However, when we build an input event for the Drools engine, we need to use the KieContainer's class loader to find the input event class referenced in the rule.

  2. Build a stateless KIE session and invoke the rule:

    public List<Object> evaluate(DroolsLoadParam droolsLoadParam, List<Object> facts) {
        if (logger.isDebugEnabled()) {
            // facts is already a List, so log it directly instead of wrapping it again
            logger.debug("Start evaluating drools, input is: {}, module name is: {}", facts,
                    droolsLoadParam.getDroolsParam().getModuleName());
        }
        DroolsParameters origParams = droolsLoadParam.getDroolsParam();
        String moduleName = origParams.getModuleName();
        DroolsDataHolder curDataHolder = this.getOrLoadDroolsModule(droolsLoadParam);
        if (curDataHolder == null) {
            throw new IllegalArgumentException("No drools module found by name: " + moduleName);
        }
        StatelessKieSession curSession = curDataHolder.getKieContainer().newStatelessKieSession();
        // Rules are expected to modify the fact objects in place, so the same list carries the results
        curSession.execute(facts);
        return facts;
    }
  3. Convert the rule evaluation result back to stream data.
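The two conversion steps in the list above are not shown in code. A minimal sketch of both follows, assuming that a stream record is represented as a Map<String, Object> and that the rule input class is a plain JavaBean with a no-argument constructor. PojoMapper and SensorReading are hypothetical names used for illustration; SensorReading stands in for a customer class that would really be loaded through the KieContainer's class loader.

```java
import java.beans.BeanInfo;
import java.beans.IntrospectionException;
import java.beans.Introspector;
import java.beans.PropertyDescriptor;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical mapper between stream records (maps) and rule input beans. */
final class PojoMapper {

    /** Stand-in for a customer-provided rule input class. */
    public static class SensorReading {
        private String id;
        private double temperature;
        public String getId() { return id; }
        public void setId(String id) { this.id = id; }
        public double getTemperature() { return temperature; }
        public void setTemperature(double temperature) { this.temperature = temperature; }
    }

    /** Loads the class through the given loader and copies matching record fields via setters. */
    static Object toPojo(ClassLoader loader, String className, Map<String, Object> record)
            throws ReflectiveOperationException, IntrospectionException {
        Class<?> pojoClass = loader.loadClass(className);
        Object pojo = pojoClass.getDeclaredConstructor().newInstance();
        BeanInfo info = Introspector.getBeanInfo(pojoClass, Object.class);
        for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
            // Record keys without a matching writable property are ignored.
            if (pd.getWriteMethod() != null && record.containsKey(pd.getName())) {
                pd.getWriteMethod().invoke(pojo, record.get(pd.getName()));
            }
        }
        return pojo;
    }

    /** Reads every readable bean property back into a map. */
    static Map<String, Object> toRecord(Object pojo)
            throws ReflectiveOperationException, IntrospectionException {
        Map<String, Object> record = new HashMap<>();
        BeanInfo info = Introspector.getBeanInfo(pojo.getClass(), Object.class);
        for (PropertyDescriptor pd : info.getPropertyDescriptors()) {
            if (pd.getReadMethod() != null) {
                record.put(pd.getName(), pd.getReadMethod().invoke(pojo));
            }
        }
        return record;
    }
}
```

In the real integration, the loader argument would be the one returned by getKieContainer().getClassLoader() on the cached DroolsDataHolder, so that the class resolves inside the module's own class loading environment.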

Under the hood:
Drools class relations

Drools package loading

Things to note:
A Drools package can be large, and the current approach caches every loaded package in memory, so loading time and memory consumption may become scalability bottlenecks. A better approach would be to build a standalone rule server that manages the rules and exposes a REST API to the stream application.

Finding the input metadata for a rule: it is possible to find out the Java class of each rule variable. This is useful as a hint for mapping stream data to rule input.

public static Map<String, Class<?>> getRuleInputMeta(KieBase kieBase,
            String rulePkgName, String ruleName) {
        RuleImpl r = (RuleImpl)kieBase.getRule(rulePkgName, ruleName);
        List<RuleConditionElement> elements = r.getLhs().getChildren();
        Pattern curPattern = null;
        String curId = null;
        ObjectType curObjType = null;
        Map<String, Class<?>> result = new HashMap<String, Class<?>>();
        for (RuleConditionElement nextElem : elements) {
            if (nextElem instanceof Pattern) {
                curPattern = (Pattern)nextElem;
                curObjType = curPattern.getObjectType();
                curId = curPattern.getDeclaration().getIdentifier();
                result.put(curId, curObjType.getValueType().getClassType());
            }
        }
        return result;
    }

Maven shade plugin and drools jar:
To use the Drools Java API, multiple jars need to be included as Maven dependencies.

However, the special thing about Drools jars is that each one contains a kie.conf file (e.g. drools-core.jar, kie-internal.jar). By default, the Maven Shade plugin lets these kie.conf files overwrite each other, which causes a runtime error when the shaded jar is deployed to Flink. The mitigation is to configure the Maven Shade plugin so that the content of each kie.conf is appended to the combined file instead of being overwritten.

<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
          <configuration>
            <transformers combine.children="append">
              <transformer implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                <resource>META-INF/kie.conf</resource>
              </transformer>
            </transformers>
          </configuration>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
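To see how many jars contribute a kie.conf before shading, the classpath can be scanned for every copy of the resource. This is a small diagnostic sketch using only the JDK; KieConfScan is a hypothetical class name and is not part of Drools or the Shade plugin.

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

/** Hypothetical diagnostic: finds every copy of a resource visible to a class loader. */
final class KieConfScan {

    /** Returns the location of each copy of the named resource. */
    static List<URL> locate(ClassLoader loader, String resourceName) throws IOException {
        return Collections.list(loader.getResources(resourceName));
    }

    public static void main(String[] args) throws IOException {
        List<URL> copies = locate(KieConfScan.class.getClassLoader(), "META-INF/kie.conf");
        System.out.println(copies.size() + " copies of META-INF/kie.conf on the classpath");
        copies.forEach(System.out::println);
    }
}
```

Run with the unshaded dependencies on the classpath, it lists one URL per jar that ships a kie.conf. A shaded jar holds a single entry, which with the configuration above should contain the content of all of them.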
