If you don’t know it yet, this is the fourth part of a series about Apache Camel. The previous posts can be found here: first part, second part and third part.
In this fourth article, we are going to show how to properly use the Spring framework in the sample application, while adding some extra functionality to it.
Extracting RSS feeds
If you remember what we have done so far, the sample application can, given the URL of a web page, extract the text content of its body paragraphs. To make our application a little more powerful, let’s add the ability to extract the contents of the web pages listed in an RSS feed. To accomplish this, add a new class called “RssExtractorRoutes” with the following content:
public class RssExtractorRoutes extends RouteBuilder {
@Override
public void configure() throws Exception {
from("rss:http://feeds.bbci.co.uk/news/rss.xml?splitEntries=false")
.marshal().rss()
.marshal().string()
.split(xpath("//item/link/text()"))
.log("Link: '${body}'")
.to(PAGE_EXTRACTOR_EP);
}
}
This simple route uses the Camel “rss” endpoint to access the RSS feed from BBC News, transforms the content into XML with the rss “marshaller” and then extracts the article links by means of XPath. As a last step, it routes the extracted links to our already existing “page extractor” route.
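To see what the XPath expression selects, here is a minimal stand-alone sketch that applies the same expression with the JDK's own XPath API. It is only an illustration: inside the route Camel evaluates the expression for us, and the class name "RssLinkXPathSketch" and the sample feed are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of the sample application.
class RssLinkXPathSketch {

    // Returns the text of every <link> nested directly in an <item>, in document order.
    static List<String> extractLinks(String rssXml) {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes;
        try {
            nodes = (NodeList) xpath.evaluate(
                    "//item/link/text()",
                    new InputSource(new StringReader(rssXml)),
                    XPathConstants.NODESET);
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate XPath on the feed", e);
        }
        List<String> links = new ArrayList<String>();
        for (int i = 0; i < nodes.getLength(); i++) {
            links.add(nodes.item(i).getNodeValue());
        }
        return links;
    }

    public static void main(String[] args) {
        // Made-up feed: the channel-level <link> is not inside an <item>, so it is skipped.
        String feed = "<rss><channel>"
                + "<link>http://example.org/</link>"
                + "<item><title>First</title><link>http://example.org/a</link></item>"
                + "<item><title>Second</title><link>http://example.org/b</link></item>"
                + "</channel></rss>";
        System.out.println(extractLinks(feed)); // prints [http://example.org/a, http://example.org/b]
    }
}
```

In the route, each of these text nodes becomes the body of its own message after the split.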
Because we have added a new component, we need to add a new dependency to our Maven pom file:
<dependency>
<groupId>org.apache.camel</groupId>
<artifactId>camel-rss</artifactId>
<version>${camel.version}</version>
</dependency>
We need to add this new route builder to the Camel context, and we will also remove the code that was used to send a URL to the extractor route. To do so, adjust the code of the main class like this:
public class Part4 {
public static void main(String[] args) throws Exception {
DefaultCamelContext camelContext = new DefaultCamelContext();
camelContext.addRoutes(new PageExtractorRoutes());
camelContext.addRoutes(new HtmlImproverRoutes());
camelContext.addRoutes(new RssExtractorRoutes());
camelContext.start();
Thread.sleep(100000);
camelContext.stop();
}
}
If you run the application, the log entries should show that all the articles present in the feed are routed to the extractor; afterward, however, the extracted paragraphs are simply discarded.
Let’s do something useful with them and store them in a search index. For that, we could use the Apache Lucene “endpoint” that Camel already offers, but instead let’s use an Elastic Search node and explore how to configure and manage the life cycle of an Elastic Search instance by means of Spring.
We need a new Elastic Search “bean” class with this content:
public class ElasticSearchBean {
public static final int DEFAULT_MAX_RESULTS = 50;
public static final String INDEX_METHOD = "index";
public static final String SEARCH_METHOD = "search";
public static final String ID_HEADER = "search_id";
private final Client fClient;
private final Node fNode;
private final String fIndex;
private final String fType;
private final String fField;
private int fMaxResults = DEFAULT_MAX_RESULTS;
public ElasticSearchBean(String index, String type, String field) {
fIndex = index;
fType = type;
fField = field;
fNode = nodeBuilder().local(true).node();
fClient = fNode.client();
}
public String getIndex() {
return fIndex;
}
public String getType() {
return fType;
}
public String getField() {
return fField;
}
public int getMaxResults() {
return fMaxResults;
}
public void setMaxResults(int maxResults) {
fMaxResults = maxResults;
}
public List<String> search(String query) {
SearchResponse searchResponse = search(
getField(), query,
getMaxResults(), getIndex()
);
List<String> results = new ArrayList<String>();
for (SearchHit hit : searchResponse.getHits()) {
results.add(hit.id());
}
return results;
}
public void index(Exchange exchange) {
Message in = exchange.getIn();
String id = in.getHeader(ID_HEADER, String.class);
String content = in.getBody(String.class);
try {
index(getIndex(), getType(), id, getField(), content);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
public void close() {
// Close the client before the node that created it.
fClient.close();
fNode.close();
}
private IndexResponse index(String index, String type, String id, String fieldName, String fieldValue) throws IOException {
XContentBuilder item = jsonBuilder()
.startObject()
.field(fieldName, fieldValue)
.endObject();
return fClient.prepareIndex(index, type, id)
.setSource(item)
.execute()
.actionGet();
}
private SearchResponse search(String fieldName, String query, int maxResults, String... indexes) {
return fClient.prepareSearch(indexes)
.setSearchType(SearchType.DEFAULT)
.setQuery(termQuery(fieldName, query))
.setFrom(0).setSize(maxResults).setExplain(true)
.execute()
.actionGet();
}
}
This class is a little more complicated than what we have done so far, but basically it creates an Elastic Search “local” node and offers one method to index content under a certain structure and another method to search for a text within the previously indexed contents. Some aspects of the bean are also configurable: the name of the index, the type of the content, the index field where the content is stored and the maximum number of search results returned.
To use it in the RSS extractor, adjust the code like this:
public class RssExtractorRoutes extends RouteBuilder {
private static final String ARTICLES_INDEX = "articles";
private static final String ARTICLE_CONTENT_FIELD = "content";
private static final String ARTICLE_TYPE = "article";
private static final ElasticSearchBean ELASTIC_SEARCH_BEAN = new ElasticSearchBean(ARTICLES_INDEX, ARTICLE_TYPE, ARTICLE_CONTENT_FIELD);
@Override
public void configure() throws Exception {
from("rss:http://feeds.bbci.co.uk/news/rss.xml?splitEntries=false")
.marshal().rss()
.marshal().string()
.split(xpath("//item/link/text()"))
.setHeader(ID_HEADER, body())
.log("Link: '${body}'")
.to(PAGE_EXTRACTOR_EP)
.bean(ELASTIC_SEARCH_BEAN, INDEX_METHOD);
}
}
And finally, add the Elastic Search dependency to the Maven pom file:
<dependency>
<groupId>org.elasticsearch</groupId>
<artifactId>elasticsearch</artifactId>
<version>0.15.2</version>
</dependency>
If you paid attention to the changes in the RSS extractor route builder, you will have noticed that we now create a static instance of the bean and configure it statically. This is not completely bad, but it has two disadvantages and one problem:
- If we want to test this route, we cannot supply a different implementation of the bean.
- If we want to change the name of the index, the name of the field or any other configurable value, we have to do it in the code and recompile the application.
- When the bean is created, the Elastic Search node is correctly initialized, but when is the “close” method called to shut down the instance properly? This is, of course, the problem I mentioned.
To correct these issues, we can take advantage of the Spring framework and the excellent Spring support that Camel offers.
First of all, let’s include the Spring dependency in our Maven pom file:
<dependency>
<groupId>org.apache.camel</groupId>
<artifactId>camel-spring</artifactId>
<version>${camel.version}</version>
</dependency>
Now, let’s create a file named “camel-context.xml” under “resources/META-INF/spring” with this content:
<beans xmlns="http://www.springframework.org/schema/beans" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation=" http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans-3.0.xsd
http://camel.apache.org/schema/spring http://camel.apache.org/schema/spring/camel-spring.xsd">
<camelContext xmlns="http://camel.apache.org/schema/spring">
<packageScan>
<package>com.canoo.camel</package>
</packageScan>
</camelContext>
</beans>
And as a last step, adapt the “main” class like this:
import org.apache.camel.spring.Main;
public class Part4 {
public static void main(String[] args) throws Exception {
Main main = new Main();
main.enableHangupSupport();
main.start();
}
}
If you run the application now, the only difference that you should notice is that the application does not stop after 100 seconds.
By using the “Main” class that the Camel Spring support offers, we start the Spring container, which automatically looks for Spring configuration files, loads them and uses them to configure itself. Such files must have the extension “.xml” and be placed in the “META-INF/spring” directory on the classpath.
The “enableHangupSupport” call instructs the application to listen for “ctrl-c” key strokes and to properly stop the Spring container before terminating.
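Conceptually, this kind of hangup support comes down to a JVM shutdown hook. The following plain Java sketch shows the idea only; it is not how Camel or Spring implement it, and the class “ShutdownCloserSketch” is a hypothetical helper invented for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper, not part of Camel or Spring: collects resources and
// closes them in reverse registration order when the JVM shuts down.
class ShutdownCloserSketch {
    private final Deque<AutoCloseable> resources = new ArrayDeque<AutoCloseable>();

    // Registers a resource; the most recently registered one is closed first.
    synchronized void register(AutoCloseable resource) {
        resources.push(resource);
    }

    // Closes everything registered so far; calling it again is a no-op.
    synchronized void closeAll() {
        while (!resources.isEmpty()) {
            try {
                resources.pop().close();
            } catch (Exception e) {
                System.err.println("Failed to close resource: " + e);
            }
        }
    }

    // Asks the JVM to run closeAll() on normal exit or when it is interrupted with ctrl-c.
    void installHook() {
        Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
            public void run() {
                closeAll();
            }
        }));
    }

    public static void main(String[] args) {
        ShutdownCloserSketch closer = new ShutdownCloserSketch();
        // Stand-in for a real resource such as the search node.
        closer.register(() -> System.out.println("search node closed"));
        closer.installHook();
        System.out.println("running, exiting now");
    }
}
```

With Spring in place we do not have to write this ourselves: the container keeps track of its beans and takes care of shutting them down.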
Now that we have configured Spring, let’s move the Elastic Search bean into the Spring beans file (“camel-context.xml”):
<bean class="com.canoo.camel.beans.ElasticSearchBean" id="elasticSearchBean" scope="singleton" destroy-method="close">
<constructor-arg value="articles"/>
<constructor-arg value="article"/>
<constructor-arg value="content"/>
</bean>
And adjust the RSS extractor route like this:
@Override
public void configure() throws Exception {
from("rss:http://feeds.bbci.co.uk/news/rss.xml?splitEntries=false")
.marshal().rss()
.marshal().string()
.split(xpath("//item/link/text()"))
.setHeader(ID_HEADER, body())
.log("Link: '${body}'")
.to(PAGE_EXTRACTOR_EP)
.beanRef("elasticSearchBean", INDEX_METHOD);
}
With these changes, we let Spring instantiate and configure a unique instance of our bean (a singleton) and instruct Spring to invoke the “close” method whenever the container is destroyed (which happens when the user presses “ctrl-c” on the shell).
Because the bean is now looked up by name in the Camel bean registry (the Spring container in our case), substituting the bean implementation with another one only requires using a different Spring configuration file, or overriding the bean definition by loading a second beans file with a new definition of the same bean. This is convenient if, for example, we want to use mocks for testing or want to have different search services.
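For testing, such a substitute could be as simple as an in-memory map. Below is a minimal sketch, assuming a hypothetical class “InMemorySearchBeanSketch”. To really replace the bean in the routes it would need the same method signatures as “ElasticSearchBean” (in particular “index(Exchange)”); they are simplified here to keep the example free of Camel dependencies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical test double: keeps the lowercased words of each document in memory.
class InMemorySearchBeanSketch {
    private final Map<String, Set<String>> termsById = new LinkedHashMap<String, Set<String>>();

    // Stores the content under the given id, split into lowercase words.
    void index(String id, String content) {
        String[] words = content.toLowerCase(Locale.ROOT).split("\\W+");
        termsById.put(id, new LinkedHashSet<String>(Arrays.asList(words)));
    }

    // Returns the ids of all documents containing exactly the given term,
    // in the order in which they were indexed.
    List<String> search(String term) {
        List<String> ids = new ArrayList<String>();
        for (Map.Entry<String, Set<String>> entry : termsById.entrySet()) {
            if (entry.getValue().contains(term)) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        InMemorySearchBeanSketch bean = new InMemorySearchBeanSketch();
        bean.index("http://example.org/a", "Breaking news from London");
        bean.index("http://example.org/b", "Weather report");
        System.out.println(bean.search("news")); // prints [http://example.org/a]
    }
}
```

A test configuration file could then declare this class under the same bean id, “elasticSearchBean”, and the routes would not notice the difference.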
To avoid having to edit the Spring beans file in order to change the Elastic Search bean configuration values, we can use a Spring “PropertyPlaceholderConfigurer” and create a properties file in the classpath (we will store it within the application, under the “resources” directory):
<bean class="org.springframework.beans.factory.config.PropertyPlaceholderConfigurer">
<property name="location">
<value>elasticsearch.properties</value>
</property>
</bean>
<bean class="com.canoo.camel.beans.ElasticSearchBean" id="elasticSearchBean" scope="singleton"
destroy-method="close">
<constructor-arg value="${index_name}"/>
<constructor-arg value="${content_type}"/>
<constructor-arg value="${index_field}"/>
</bean>
With this configuration change, Spring looks for a properties file named “elasticsearch.properties” at the root of the classpath and resolves the “${}” placeholders against it, so we can alter these values without touching the Spring beans file or recompiling the application.
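To make the substitution more concrete, here is a small plain Java sketch of what resolving such placeholders means. It is not Spring's implementation, and the class “PlaceholderSketch” is hypothetical. The three keys are the ones used in the bean definition above, and the values are the ones we previously passed as constructor arguments, so the real “elasticsearch.properties” file would contain the same three lines.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of "${key}" resolution against a properties file.
class PlaceholderSketch {
    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Parses text in the standard ".properties" format.
    static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props;
    }

    // Replaces every "${key}" in the value with the matching property; fails on unknown keys.
    static String resolve(String value, Properties props) {
        Matcher matcher = PLACEHOLDER.matcher(value);
        StringBuffer resolved = new StringBuffer();
        while (matcher.find()) {
            String key = matcher.group(1);
            String replacement = props.getProperty(key);
            if (replacement == null) {
                throw new IllegalArgumentException("No value for placeholder: " + key);
            }
            matcher.appendReplacement(resolved, Matcher.quoteReplacement(replacement));
        }
        matcher.appendTail(resolved);
        return resolved.toString();
    }

    public static void main(String[] args) {
        // Same content the properties file would have.
        Properties props = parse(
                "index_name=articles\n"
                + "content_type=article\n"
                + "index_field=content\n");
        System.out.println(resolve("${index_name}", props)); // prints articles
    }
}
```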
To end this article by doing something useful with our indexed contents, let’s create a search web service that lets us execute a search query and displays a web page with the links of the matching articles.
For this, create a new route builder:
public class SearchServiceRoutes extends RouteBuilder {
@Override
public void configure() throws Exception {
from("jetty://http://0.0.0.0:8080/search")
.setBody(header("query"))
.to("direct:search")
.bean(HtmlFormatterBean.class, AS_SEARCH_RESULTS_PAGE);
from("direct:search")
.log("Searching: '${body}'.")
.beanRef("elasticSearchBean", SEARCH_METHOD)
.bean(HtmlFormatterBean.class, AS_LINKS)
.log("Found results: '${body}'.");
}
}
Notice that we no longer need to add the new route builders to the Camel context explicitly, because Spring instantiates the context and looks for route builders under the package “com.canoo.camel” (as configured in “camel-context.xml”).
We also need to extend the “HtmlFormatterBean” class as follows:
public class HtmlFormatterBean {
public static final String AS_EXTRACTED_RESULTS_PAGE = "asExtractedResultsPage";
public static final String AS_SEARCH_RESULTS_PAGE = "asSearchResultsPage";
public static final String AS_LINKS = "asLinks";
public String asExtractedResultsPage(List<String> contents) {
return asPage(contents, "Extracted contents:");
}
public String asSearchResultsPage(List<String> contents) {
return asPage(contents, "Search results:");
}
private String asPage(List<String> contents, String title) {
// StringBuilder is enough here: the buffer never leaves this method.
StringBuilder html = new StringBuilder();
html.append(String.format("<html><body><h1>%s</h1><ul>", title));
for (String content : contents) {
html.append("<li>").append(content).append("</li>");
}
html.append("</ul></body></html>");
return html.toString();
}
public List<String> asLinks(List<String> urls) {
List<String> result = new ArrayList<String>();
for (String url : urls) {
result.add(String.format("<a href='%1$s'>%1$s</a>", url));
}
return result;
}
}
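Please notice that, because this bean now has more than one method, you will also need to specify the method to call in every reference to this bean within the application.
To test our new search service, start the application and point your browser to the URL: http://localhost:8080/search?query=news. If any of the indexed articles contain the word “news”, you should see a list with their links. Note that the bean uses a term query, which is not analyzed, so the query should be a single word in the form in which it was stored in the index (lowercase, assuming the default analyzer).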
The camel ride ends here. I hope that you enjoyed reading the articles as much as I enjoyed writing them. I also hope that they have helped you get to know how Camel works and that you can find some use cases in which to apply this slightly different way of building integration applications.
The code for this fourth part is here. To run the application, just unzip the file, change to the directory where the pom file is and type 'mvn compile exec:java -Dexec.mainClass="com.canoo.camel.Part4"' in the console.
Hope to see you soon in another post!