LEGO® Java: Apache Camel Context and Route Basics
If you are a fan of “Enterprise Integration Patterns” or you work in data integration projects, you have probably heard about Apache Camel. If not, this is a good moment to discover it.
And what does Apache Camel have to do with LEGO? LEGO offers a fixed set of blocks that you can combine to build working gadgets. In the same sense, Camel offers you Java implementations of the EIPs that you can mix and match to get some functionality done, without having to construct the blocks yourself. If you are open and eager to learn new ways of building Java applications, let’s start with some practical Camel examples.
The Minimal Camel Application
There are several ways of deploying Camel: as a plain Java application, as a Spring-enabled application, as a Java web application, as an OSGi module, etc. Let’s begin with a plain Java application and make our life easier by using Maven to grab the dependencies (the default option for Camel).
You will need a new Maven-based project and a “main” Java class:
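import org.apache.camel.impl.DefaultCamelContext;

public class Part1 {
    public static void main(String[] args) throws Exception {
        // Create and start an empty Camel context (no routes yet).
        DefaultCamelContext camelContext = new DefaultCamelContext();
        camelContext.start();
        // Keep the application alive for 10 seconds, then stop the context.
        Thread.sleep(10000);
        camelContext.stop();
    }
}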
The main method in the “main” class creates a default implementation of a Camel context. It then starts the context, waits 10 seconds and stops it before ending the application. This is an abrupt way to stop the context. For small prototypes, stopping the context improperly may have no side effects, but don’t do it this way when deploying a production application. If you want to learn more about this topic, I recommend reading chapter 13.3 of “Camel in Action”. There, the author explains what a “graceful shutdown” is, how to do it properly and why.
The Maven pom file with the minimal dependencies required looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.canoo.camel</groupId>
    <artifactId>part1</artifactId>
    <version>1.0</version>
    <dependencies>
        <dependency>
            <groupId>org.apache.camel</groupId>
            <artifactId>camel-core</artifactId>
            <version>${camel.version}</version>
        </dependency>
    </dependencies>
    <properties>
        <camel.version>2.6.0</camel.version>
    </properties>
</project>
If you run the application, you should observe some messages in the console output indicating that the context has been started and stopped.
So far, so good, but our Camel application does nothing yet because we have not created any routes. We could now add the “Hello World” route, but I am sure that reading the documentation would already give you an idea of how to do it, so let’s think about a more useful example to implement using Camel. We will construct an information retrieval application capable of processing information found in web pages.
First feature: the “web page extractor”
We are going to create a first route capable of downloading the page at a URL and transforming its contents into plain text. Basically, what we want to do is grab the HTML, fetch all the paragraph elements in the page body, extract the text they contain and clean it by filtering out all strings that contain no alphanumeric characters.
For that, we need a new class extending “RouteBuilder” that can be applied to the Camel context and that creates the route (or routes). The “PageExtractorRoutes” class would look like this:
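import org.apache.camel.Exchange;
import org.apache.camel.builder.RouteBuilder;

public class PageExtractorRoutes extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        from("direct:page_extractor")
            // The incoming body is the URL; the http endpoint reads it from this header.
            .setHeader(Exchange.HTTP_URI, body())
            .log("Extracting content from: '${body}'")
            .to("http:extractor")
            // Turn the fetched HTML into a well-formed DOM node.
            .unmarshal().tidyMarkup()
            .log("Html from: '${body}'")
            // Split into paragraph text nodes; the strategy filters and collects them.
            .split(xpath("//body//p/text()"), new SplitterAggregationStrategy("(?s).*[A-Za-z0-9].*"))
                .log("Text chunk: '${body}'.")
            .end();
    }
}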
This PageExtractorRoutes class extends RouteBuilder and overrides the “configure” method. The class will be applied to the Camel context later, creating the route in the context. As you can see in the route steps, this route “listens” on an endpoint called “page_extractor” and passes each message it receives through the steps that compose the route. You may also have noticed that the endpoint is of type “direct”. This type is the simplest one in Camel and is equivalent to a synchronous Java method invocation.
Suppose that the body of the message we receive contains the URL of a web page. We want to send this URL to an endpoint of type “http” that will access the page and grab its content, passing it as the body of the message to the next step. The “http” endpoint can be configured with the URL to access or, if the header “Exchange.HTTP_URI” is present in the message, it will use that value as the URL to grab. To put the URL in the expected header, we have used “setHeader”, copying the body value. When using a dynamic URL set in the header, the name of the “http” endpoint is irrelevant (in this case we used “extractor”).
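The precedence rule just described can be modelled in a few lines of plain Java. This is only a simplified sketch of the behavior, not Camel's actual code: the class and method names are hypothetical, the map stands in for the message headers, and "CamelHttpUri" is the string value of the Exchange.HTTP_URI constant.

```java
import java.util.HashMap;
import java.util.Map;

class UriResolutionSketch {

    // String value of Camel's Exchange.HTTP_URI constant.
    static final String HTTP_URI = "CamelHttpUri";

    // Hypothetical helper: a URL found in the header wins over the endpoint's own URI.
    static String resolve(Map<String, Object> headers, String endpointUri) {
        Object override = headers.get(HTTP_URI);
        return override != null ? override.toString() : endpointUri;
    }

    public static void main(String[] args) {
        Map<String, Object> headers = new HashMap<String, Object>();
        // No header set: the endpoint URI is used.
        System.out.println(resolve(headers, "http:extractor"));

        // Header set (as our route does with setHeader): the header value is used.
        headers.put(HTTP_URI, "http://www.w3.org");
        System.out.println(resolve(headers, "http:extractor"));
    }
}
```

With the header in place, the endpoint name only serves as a label, which is why any name works in the route.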
Once the “http” endpoint has fetched the content, we have to take care of some details before we can use XPath to select the text of the body paragraphs:
- Some endpoint types in Camel use streaming to avoid loading big pieces of data into memory. This means that the payload of the message is of type “InputStream” and, by default, can only be consumed once. This is the case with the “http” endpoint. If, as in our case, we know that the fetched web content is not going to be big, we can convert it to a string that can be consumed as many times as necessary (using “convertBodyTo(String.class)” from the Java DSL), or we can activate the stream cache using the method “streamCaching()” from the Java DSL at the beginning of the route.
- Because XPath needs well-formed XML documents, we need to ensure that the HTML content fulfills this condition. Unmarshalling with the “TidyMarkup” data format achieves that and automatically converts our input stream to a DOM node, which is a perfect fit for the XPath expression in the next step. Tidied markup can also be consumed multiple times.
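To see why a stream payload is a problem, here is a small plain-Java sketch with no Camel involved. The class name, the helper and the sample content are made up for illustration; the in-memory stream merely stands in for an HTTP response body.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class StreamOnceDemo {

    // Reads every remaining byte of the stream and decodes it as UTF-8.
    static String drain(InputStream in) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in for the body returned by an HTTP call (hypothetical content).
        InputStream payload = new ByteArrayInputStream(
                "<p>Hello</p>".getBytes(StandardCharsets.UTF_8));

        String first = drain(payload);   // consumes the stream
        String second = drain(payload);  // nothing is left to read

        System.out.println("first read:  '" + first + "'");
        System.out.println("second read: '" + second + "'");
        // The String copy, unlike the stream, can be reused freely.
        System.out.println("copy reused: '" + first + "'");
    }
}
```

The second read comes back empty, which is what a second route step would see without a string conversion or stream caching.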
If everything works as expected, we can now split the DOM document into text chunks. To do so, we use a splitter (the “split” Java DSL method) that takes an expression as its first parameter and an “AggregationStrategy” as its second (this is only one of the many forms of the split method). The expression is responsible for returning the pieces that the split method processes, and the aggregation strategy has two roles in this case: to merge the split pieces in the required way and to filter out the unwanted pieces (text chunks without alphanumeric characters in our case). The “.end()” method call belongs to the splitter and indicates where the pieces should be joined, finally returning a single message whose body is the aggregation of the previously split pieces.
Ideally, we would separate these two concerns by using the Java DSL method “filter”. However, it seems that the splitter is unable to detect when the split process has finished if some pieces have been dropped by the filter, which is what we would do. As a workaround, we perform both steps together in the aggregation strategy. Note that the XPath expression is applied implicitly to the message body.
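To get a feeling for what the split expression selects, the same XPath can be tried outside Camel with the XML APIs that ship with the JDK. This is a sketch under assumptions: the class name and the sample page are invented, and the input is already well-formed, which is the job TidyMarkup does for real pages.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class ParagraphTextDemo {

    // Parses well-formed markup and returns the trimmed text nodes of all body paragraphs.
    static List<String> paragraphTexts(String xhtml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xhtml)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate("//body//p/text()", doc, XPathConstants.NODESET);
            List<String> chunks = new ArrayList<String>();
            for (int i = 0; i < nodes.getLength(); i++) {
                chunks.add(nodes.item(i).getNodeValue().trim());
            }
            return chunks;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse or query the document", e);
        }
    }

    public static void main(String[] args) {
        // Hypothetical sample page; the heading is ignored, the nested paragraph is found.
        String page = "<html><body><h1>Title</h1><p>First paragraph.</p><p> *** </p>"
                + "<div><p>Nested paragraph</p></div></body></html>";
        System.out.println(paragraphTexts(page));
    }
}
```

Note that the decorative “***” chunk is still part of the result here; removing such pieces is the filtering role that the aggregation strategy takes over in the route.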
The “SplitterAggregationStrategy” class looks like this:
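import java.util.ArrayList;
import java.util.List;

import org.apache.camel.Exchange;
import org.apache.camel.processor.aggregate.AggregationStrategy;
import org.w3c.dom.Node;

public class SplitterAggregationStrategy implements AggregationStrategy {
    private final String fFilter;

    public SplitterAggregationStrategy(String filter) {
        fFilter = filter;
    }

    @SuppressWarnings("unchecked")
    public Exchange aggregate(Exchange oldExchange, Exchange newExchange) {
        // On the first call there is no previous exchange, so start with an empty list.
        List<String> result = oldExchange == null
                ? new ArrayList<String>()
                : oldExchange.getIn().getBody(List.class);
        Node node = newExchange.getIn().getBody(Node.class);
        String content = node.getNodeValue().trim();
        // Keep only chunks that match the filter (at least one alphanumeric character).
        if (content.matches(fFilter)) {
            result.add(content);
        }
        newExchange.getIn().setBody(result);
        return newExchange;
    }
}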
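The way the splitter drives such a strategy can be simulated in plain Java. The sketch below is not Camel code: the class and method names are hypothetical, plain strings replace the exchanges and DOM nodes, and a simple loop plays the role of the splitter. It shows the two points that matter, namely that the first call receives no previous result and that non-matching chunks are silently skipped.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitFoldDemo {

    // Same pattern as in the route: at least one alphanumeric character anywhere.
    static final String FILTER = "(?s).*[A-Za-z0-9].*";

    // Mirrors the aggregate() contract: 'previous' is null on the first call.
    static List<String> aggregate(List<String> previous, String piece) {
        List<String> result = previous == null ? new ArrayList<String>() : previous;
        String content = piece.trim();
        if (content.matches(FILTER)) {
            result.add(content);
        }
        return result;
    }

    // Plays the role of the splitter: feeds each piece to the strategy in order.
    static List<String> splitAndAggregate(List<String> pieces) {
        List<String> aggregated = null;
        for (String piece : pieces) {
            aggregated = aggregate(aggregated, piece);
        }
        return aggregated;
    }

    public static void main(String[] args) {
        // Hypothetical text chunks, including two purely decorative ones.
        System.out.println(splitAndAggregate(
                Arrays.asList("  W3C  ", "\n***\n", "Standards", "---")));
    }
}
```

Because the list is created on the first call even when that first chunk is rejected, later chunks always have a list to be added to.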
To use this route we need to modify the main method of the “Part1” class in the following way:
public static void main(String[] args) throws Exception {
    final DefaultCamelContext camelContext = new DefaultCamelContext();
    camelContext.addRoutes(new PageExtractorRoutes());
    camelContext.start();
    ProducerTemplate template = camelContext.createProducerTemplate();
    String result = template.requestBody("direct:page_extractor", "http://www.w3.org", String.class);
    System.out.printf("Extracted: '%s'.\n", result);
    Thread.sleep(10000);
    camelContext.stop();
}
Before starting the context, we now apply our route builder to it so that the route gets added. To send a URL to our extractor and get the result, we need to create a “ProducerTemplate” and invoke one of its many available methods. In our case, we send a single value as the body, make a synchronous call and get a string result that is printed to the console. If you inspect the methods of the “ProducerTemplate” class, you will notice many different variations. These correspond to variations of the call characteristics just mentioned.
The last detail to notice in the route is that we added some logging code using the Java DSL “log” method.
Last but not least, we need to complete the dependencies in our Maven pom file like this:
...
    <dependencies>
        <dependency>
            <groupId>org.apache.camel</groupId>
            <artifactId>camel-core</artifactId>
            <version>${camel.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.camel</groupId>
            <artifactId>camel-http</artifactId>
            <version>${camel.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.camel</groupId>
            <artifactId>camel-tagsoup</artifactId>
            <version>${camel.version}</version>
        </dependency>
    </dependencies>
...
To keep this post short, we will stop here and continue in the second part of the series. In the next part, we will introduce the following topics and features:
- Using error handling to control HTTP redirections and Java Beans for fine-grained logic.
- Exposing our extractor functionality as a Web service.
If you enjoyed the reading, I hope to see you in the second part.
Note: if you are interested in learning more about Camel Enterprise Integration Patterns, you may want to have a look at the DZone Refcard: The Top Twelve Integration Patterns for Apache Camel.
Update: added source code for download. To run the application, just expand the file, change to the folder where the pom file is and execute: 'mvn compile exec:java -Dexec.mainClass="com.canoo.camel.Part1"'

Claus Ibsen said,
March 15, 2011 @ 1:41 pm
Hi
Nice blog. Liking the LEGO analogy, as I was born in Denmark and played with LEGO as a child
I took the liberty of adding a link to this blog from the Camel articles page (our link collection)
http://camel.apache.org/articles
Takes a couple of hours for the website to sync and be updated.
/Claus Ibsen
Camel committer