Stateless Workers are microservices that can be scaled up or down to meet demand. In essence, a Worker should not be aware (or need to be aware) of any other Worker or its surrounding system. This way additional “clones” of a Worker can be created or destroyed at will to provide scaling without affecting other Workers.
The following modules are required to create a Document worker. Each contains its own pom.xml with its own dependencies and plugins:
These modules are generated from the worker-document-archetype:
- `worker-example` module contains the worker itself and Markdown (.md) documentation explaining how the worker's service is used.
- `worker-example-container` module builds the Docker image of the worker and pushes the image to Docker. The module starts containers for RabbitMQ and the worker, then runs the worker acceptance integration test cases via the `worker-document-testing` module, whose `DocumentWorkerTestControllerProvider` class is used to generate or run worker integration test case files.

A Maven Archetype is a template which you can base a project on.
Excerpt from the Apache Maven Archetype Introduction:
"Archetype is a Maven project templating toolkit. An archetype is defined as an original pattern or model from which all other things of the same kind are made."
You can create the foundations of a new Document Worker project by using the `worker-document-archetype` project.
A new project generated from the Document Worker Archetype contains some basic functionality.
It performs a simple lookup on values from the fields passed in from a Document. It looks for a field named ‘REFERENCE’, uses that field's values to retrieve values from a Map, and replaces any existing values in the ‘UNIQUE_ID’ field with the retrieved values.
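The lookup described above can be sketched in plain Java. This is an illustration only, not the code the archetype generates: the class name `ReferenceLookupSketch`, the contents of the `LOOKUP` map, and the representation of a Document's fields as a `Map<String, List<String>>` are all assumptions made for the example. It also assumes that references with no entry in the map are simply skipped.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the archetype's sample behaviour; not the generated worker class.
final class ReferenceLookupSketch {

    // Hypothetical lookup table; the real worker defines its own Map.
    static final Map<String, String> LOOKUP = Map.of(
        "ref-1", "id-001",
        "ref-2", "id-002");

    /**
     * Reads the values of the REFERENCE field, looks each one up in LOOKUP,
     * and replaces whatever UNIQUE_ID held with the values that were found.
     * Assumption: references that are not in the map are skipped.
     */
    static void process(Map<String, List<String>> fields) {
        List<String> references = fields.getOrDefault("REFERENCE", List.of());
        List<String> resolved = new ArrayList<>();
        for (String reference : references) {
            String value = LOOKUP.get(reference);
            if (value != null) {
                resolved.add(value);
            }
        }
        // Existing UNIQUE_ID values are discarded, not appended to.
        fields.put("UNIQUE_ID", resolved);
    }

    public static void main(String[] args) {
        Map<String, List<String>> fields = new HashMap<>();
        fields.put("REFERENCE", List.of("ref-1", "ref-2"));
        fields.put("UNIQUE_ID", List.of("stale"));
        process(fields);
        System.out.println(fields.get("UNIQUE_ID")); // prints [id-001, id-002]
    }
}
```

In the generated project the equivalent logic works on the Document passed to the worker, so the field access will look different from the plain map used here.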
A new Document Worker aggregator project generated from the Document Worker Archetype has a set of properties that are shared between its submodules:
The following subsections explain how to use the Maven Command Line Interface (CLI), the IntelliJ Integrated Development Environment (IDE) or the NetBeans IDE to create the components of a Document Worker from the Document Worker Archetype.

Note: You must replace WORKER-DOCUMENT-ARCHETYPE-VERSION with a valid version of Worker-Document-Archetype.
The Maven CLI offers developers the ability to generate projects from archetypes with the `mvn archetype:generate` command. The project is created in the directory where you run the command.
Generate the new Document Worker Aggregator from the `worker-document-archetype` with the following Maven command:
mvn archetype:generate -DarchetypeVersion=WORKER-DOCUMENT-ARCHETYPE-VERSION -DarchetypeArtifactId=worker-document-archetype -DarchetypeGroupId=com.github.cafdataprocessing
The CLI will prompt you for the artifactId, groupId, version (default suggestion: 1.0.0), package (default suggestion: the groupId, which you should adjust to include the worker’s purpose) and workerName properties required for the new Document Worker project. See Figure 1.
Figure 1
If you are satisfied with the properties you have set, confirm them by typing ‘Y’. Otherwise, type ‘N’ or any other character to re-enter the property values. After you confirm the properties, Maven generates the new Document Worker Aggregator project, which contains the following submodules:
- `<artifactId>` - submodule containing the Worker’s backend code.
- `<artifactId>-container` - submodule containing the Worker’s container and configuration.

IntelliJ offers developers the ability to generate projects from archetypes via its GUI.
Generate the new Document Worker Aggregator from the `worker-document-archetype` by following these instructions:
The foundations for your new Document Worker are now set up. The generated project will contain the following submodules:
- `<artifactId>` - submodule containing the Worker’s backend code.
- `<artifactId>-container` - submodule containing the Worker’s container and configuration.

NetBeans offers developers the ability to generate projects from archetypes via its GUI.
Generate the new Document Worker Aggregator from the `worker-document-archetype` by following these instructions:
The foundations for your new Document Worker are now set up. The generated project will contain the following submodules:
- `<artifactId>` - submodule containing the Worker’s backend code.
- `<artifactId>-container` - submodule containing the Worker’s container and configuration.

The `<artifactId>-container` submodule contains a default JavaScript configuration file.
Each configuration property is read from an environment variable. If the environment variable is not set, a default value is used where one applies.
The default Document Worker configuration file checks for values as follows:
Property | Checked Environment Variables | Default |
---|---|---|
outputQueue | CAF_WORKER_OUTPUT_QUEUE | Use environment variable CAF_WORKER_BASE_QUEUE_NAME + “-out”. Or use environment variable CAF_WORKER_NAME + “-out”. Else use “worker-out”. |
failureQueue | CAF_WORKER_FAILURE_QUEUE | undefined |
threads | CAF_WORKER_THREADS | 1 |
maxBatchSize | CAF_WORKER_MAX_BATCH_SIZE | 2 |
maxBatchTime | CAF_WORKER_MAX_BATCH_TIME | 5000 |
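
The fallback order in the table can be sketched as follows. The shipped configuration file is JavaScript; this Java rendering only illustrates the resolution rules and is not the generated file. The class `WorkerConfigSketch` and its method names are hypothetical, the environment is passed in as a map so the logic can be exercised without touching real variables, and the sketch assumes a variable counts as "not set" only when it is absent (an empty string is treated as a value).

```java
import java.util.Map;

// Hypothetical illustration of the fallback rules in the table above.
final class WorkerConfigSketch {

    /** Returns the variable's value, or the fallback when the variable is absent. */
    static String envOrDefault(Map<String, String> env, String name, String fallback) {
        String value = env.get(name);
        return value != null ? value : fallback;
    }

    /**
     * Output queue resolution order: CAF_WORKER_OUTPUT_QUEUE, then
     * CAF_WORKER_BASE_QUEUE_NAME + "-out", then CAF_WORKER_NAME + "-out",
     * and finally the literal "worker-out".
     */
    static String outputQueue(Map<String, String> env) {
        String explicit = env.get("CAF_WORKER_OUTPUT_QUEUE");
        if (explicit != null) {
            return explicit;
        }
        String baseQueueName = env.get("CAF_WORKER_BASE_QUEUE_NAME");
        if (baseQueueName != null) {
            return baseQueueName + "-out";
        }
        String workerName = env.get("CAF_WORKER_NAME");
        if (workerName != null) {
            return workerName + "-out";
        }
        return "worker-out";
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        System.out.println("outputQueue  = " + outputQueue(env));
        // failureQueue has no default, so it stays null when the variable is absent.
        System.out.println("failureQueue = " + env.get("CAF_WORKER_FAILURE_QUEUE"));
        System.out.println("threads      = " + envOrDefault(env, "CAF_WORKER_THREADS", "1"));
        System.out.println("maxBatchSize = " + envOrDefault(env, "CAF_WORKER_MAX_BATCH_SIZE", "2"));
        System.out.println("maxBatchTime = " + envOrDefault(env, "CAF_WORKER_MAX_BATCH_TIME", "5000"));
    }
}
```

Setting only CAF_WORKER_NAME to a hypothetical value such as `lookup` would therefore give an output queue of `lookup-out`, while setting CAF_WORKER_OUTPUT_QUEUE overrides both of the derived names.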
Information on worker-document and its modules, worker-document-shared and worker-document-testing, which the archetype uses, can be found here.
At the time this guide was written with: