Open source business intelligence has come of age, and the competing state-of-the-art technologies leave end users spoiled for choice. Each technology has its own pros and cons, and expectations vary accordingly. In this article we take a look at one of the most widely used open source business intelligence reporting technologies.
Introducing Jasper Reports
Jasper Reports bills itself as the world's most widely used embedded Java reporting library. It promises faster report development, support for web and print-ready production reports, high performance, and massive scalability.
Why Jasper Reports?
After evaluating quite a few reporting technologies, I chose to explore Jasper because I saw some benefits in using it and believe it can support our application from a broader perspective. Let us examine some of the problems we have to contend with when we use third-party reporting tools.
Motive 1
- First and foremost, third-party tools with fancy designs and layouts are packaged as multiple libraries and plug-ins. This can increase the size of our application as much as tenfold, to around 50 MB, which we may not want. For instance, the popular BIRT reporting tool delivers many of its features as plug-ins, yet our application would use only a few of them. Jasper Reports is a single jar of just 1.2 MB and can be embedded in any Java application as easily as other frameworks such as Spring, Struts, or Hibernate.
Motive 2
- Reports always deal with data sets, and the data can come from any source: Excel, an Oracle database, a flat file, and so on. Suppose we need to get data from Oracle. Almost all third-party reporting tools, including BIRT and Jasper, offer a wizard-style interface in which the user only has to supply the connection information. From our application's perspective, however, I wondered whether we could take that connection information from the application itself. This is where Jasper Reports scores over the others. Our application uses Hibernate as its data layer, which gets the connection information from its session factory, and Jasper Reports lets us reuse that session factory. We can make a DAO call, get the data as a list, and set it as the data source for our report. Our reports are also easier to control because everything stays within our application. BIRT does support Hibernate through a scripted data source, but from my research it creates a new connection through that scripted data source, whereas Jasper can use the existing session factory from our application. Another point that stands out is that in BIRT the Hibernate Java code is written in the script editor of the BIRT designer, which again separates the reports from the application and makes them harder to control in the future. As far as I could tell, BIRT uses ODA rather than JNDI, so the data source has to be defined inside the report definition (.rptdesign).
Motive 3
- Jasper Reports integrates well with Spring and Quartz. Quartz is an open source scheduler framework from OpenSymphony, so we can run our reports as cron jobs. BASEL also uses the Quartz scheduler in BEIC, and we could probably do the same in LEDR if need be.
Architecture
As shown in the figure above, the Jasper Reports architecture is based on declarative XML files, which by convention have the extension jrxml and contain the report layout. Several third-party design tools, such as iReport and JasperAssistant, were created to generate the jrxml file smoothly. The design file is filled with the report's results, which are fetched from a database, XML files, Java collections, comma-separated values, or models. Jasper can communicate with these data sources and more; it can merge any number of data sources together and manipulate the results of any combination. This communication goes through JDBC, JNDI, XQuery, EJBQL, Hibernate, or existing Oracle PL/SQL. We can also define our own data source class and pass it to the Jasper engine directly.
After we define the report layout in jrxml format and determine the data source(s), the Jasper engine does the rest of the work: it compiles the design file, fills it with the results fetched from the data source, and generates the report in the chosen export format (PDF, Excel, HTML, XML, RTF, TXT, etc.).
Report definition file structure (jrxml):
The Jasper design file (jrxml) contains the following elements:
- <jasperReport>: the root element.
- <title>: its contents are printed only once, at the beginning of the report.
- <pageHeader>: its contents are printed at the beginning of every page in the report.
- <detail>: contains the body of the report and is repeated once for each record in the data source.
- <pageFooter>: its contents are printed at the bottom of every page in the report.
- <band>: defines a report section; each of the section elements above contains a band element as its only child.
Only the root element is mandatory; the rest of the elements are optional.
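To make the section layout concrete, here is a minimal sketch that holds a skeleton report definition in a Java string and checks that it is well-formed XML with the JDK's own parser. The report name, the band heights, and the class name JrxmlSkeleton are illustrative assumptions. A real jrxml file also carries a DTD or schema declaration, page dimensions, field declarations, and the text elements inside each band, none of which are shown here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class JrxmlSkeleton {
    // Illustrative skeleton only: the name and heights are made up for this sketch.
    static final String SKELETON =
          "<jasperReport name=\"SampleReport\">\n"
        + "  <title><band height=\"40\"/></title>\n"
        + "  <pageHeader><band height=\"20\"/></pageHeader>\n"
        + "  <detail><band height=\"15\"/></detail>\n"
        + "  <pageFooter><band height=\"20\"/></pageFooter>\n"
        + "</jasperReport>\n";

    /** Parses the skeleton and returns the names of the root's child sections, in document order. */
    static List<String> sectionNames() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(SKELETON.getBytes(StandardCharsets.UTF_8)));
            List<String> names = new ArrayList<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                Node child = children.item(i);
                // Skip the whitespace text nodes between the section elements.
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    names.add(child.getNodeName());
                }
            }
            return names;
        } catch (Exception e) {
            throw new IllegalStateException("Skeleton is not well-formed XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sectionNames());
    }
}
```

Each section wraps exactly one band, which matches the rule in the list above that a band is the only child of a section element.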
Environment
To set up the working environment, we need to download the Jasper Reports jar file from the following URL: http://sourceforge.net/project/showfiles.php?group_id=36382&package_id=28579
and add the following jars to our project classpath:
- jasperreports-2.0.4.jar
- commons-digester-1.7.jar [This jar is already under our workspace]
- commons-collections-2.1.jar (commons-collections.jar) [This jar is already under our workspace]
- commons-logging-1.0.2.jar [This jar is already under our workspace]
- commons-beanutils.jar [This jar is already under our workspace]
- iText-2.0.7.jar (used for PDF exporting) [This jar is already under our workspace]
Jasper Reports with Large Data Sets
When implementing reports over huge data sets, a few things need care so that memory is used efficiently and the application does not run out of memory:
- Pagination of the data and use of JRDataSource. Pagination is implemented in almost all reporting technologies; what matters is how the data is pulled from the data set. Jasper Reports handles this in its own way, by combining a JRDataSource with virtualization.
- Virtualization of the report.
When the data set is huge, it is not a good idea to retrieve all the data at once. The application will hog memory and run out of it even before the Jasper Reports engine starts filling the report. To avoid that, the service or DB layer should return the data in pages. We gather the data in chunks and hand the records out through the JRDataSource interface; when the records in the current chunk are used up, we fetch the next chunk, until all the chunks are consumed. We should not use the collection data sources; instead we should implement the JRDataSource interface and provide the data through next() and getFieldValue(). To provide an example, I took the “virtualizer” example from the Jasper Reports samples and modified it a bit for this article.
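The chunked approach can be sketched in plain Java without the library on the classpath. The cursor below has the same shape as JRDataSource (next() advances, getFieldValue() reads the current row), but it is only a stand-in: the real interface takes a JRField argument and declares JRException. PageFetcher, PagedRowSource, and the in-memory fetcher are hypothetical names invented for this illustration; in an application the fetcher would be a paged DAO call.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for the service/DAO layer: returns one page of rows, empty when exhausted. */
interface PageFetcher {
    List<Map<String, Object>> fetchPage(int pageIndex, int pageSize);
}

/** Cursor shaped like JRDataSource that only ever holds one page of rows in memory. */
class PagedRowSource {
    private final PageFetcher fetcher;
    private final int pageSize;
    private int nextPageIndex = 0;
    private Iterator<Map<String, Object>> currentPage = null;
    private Map<String, Object> currentRow = null;
    private boolean exhausted = false;

    PagedRowSource(PageFetcher fetcher, int pageSize) {
        this.fetcher = fetcher;
        this.pageSize = pageSize;
    }

    /** Moves to the next row, fetching the next page when the current one is used up. */
    boolean next() {
        if (exhausted) {
            return false;
        }
        if (currentPage == null || !currentPage.hasNext()) {
            List<Map<String, Object>> page = fetcher.fetchPage(nextPageIndex++, pageSize);
            if (page.isEmpty()) {
                exhausted = true;
                currentRow = null;
                return false;
            }
            currentPage = page.iterator();
        }
        currentRow = currentPage.next();
        return true;
    }

    /** Reads a field of the current row by name. */
    Object getFieldValue(String fieldName) {
        if (currentRow == null) {
            throw new IllegalStateException("No current row; call next() first");
        }
        return currentRow.get(fieldName);
    }
}

class PagedRowSourceDemo {
    /** Feeds the cursor from an in-memory fetcher over totalRows fake rows and counts what it yields. */
    static int countRows(final int totalRows, int pageSize) {
        PageFetcher fetcher = (pageIndex, size) -> {
            List<Map<String, Object>> page = new ArrayList<>();
            int start = pageIndex * size;
            for (int i = start; i < Math.min(start + size, totalRows); i++) {
                Map<String, Object> row = new HashMap<>();
                row.put("id", i);
                row.put("name", "row " + i);
                page.add(row);
            }
            return page;
        };
        PagedRowSource source = new PagedRowSource(fetcher, pageSize);
        int count = 0;
        while (source.next()) {
            source.getFieldValue("id");
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countRows(1050, 200));
    }
}
```

The point of the design is that the report engine pulls rows one at a time, so the application never needs more than pageSize rows in memory regardless of the total size of the result.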
Even when the data is returned in chunks, the final report has to be a single file, and the Jasper engine builds a JasperPrint object for it. To keep memory from piling up at this stage, Jasper Reports provides a really cool feature called the virtualizer. A virtualizer serializes pages and moves them out of the working set, typically to the file system, to avoid an out-of-memory condition. There are three types of virtualizer as of now: JRFileVirtualizer, JRSwapFileVirtualizer, and JRGzipVirtualizer.
JRFileVirtualizer is a really simple virtualizer: we specify the number of pages to keep in memory and the directory into which the Jasper engine can swap the excess pages as files. Its disadvantage is file-handling overhead, because it creates a large number of files during virtualization and finally produces the required report file from them. If the data set is not that large, we can go for JRFileVirtualizer.
JRSwapFileVirtualizer overcomes this disadvantage. It creates only one swap file, which grows as needed. We create a JRSwapFile by specifying the swap directory, the block size, and the minimum number of blocks by which the file grows when it is full; then, when creating the JRSwapFileVirtualizer, we pass the JRSwapFile as a parameter along with the number of pages to keep in memory. This virtualizer is the best fit for huge data sets.
JRGzipVirtualizer is a special virtualizer that does not write the data into files; instead it compresses the JasperPrint pages using the gzip algorithm, which reduces heap consumption. The JasperReports Ultimate Guide says that JRGzipVirtualizer can reduce memory consumption by up to a factor of ten. If we are sure our data set is not that big and we want to avoid file I/O, we can go for JRGzipVirtualizer.
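To illustrate the idea behind the in-memory gzip approach, here is a conceptual sketch in plain Java: a small cache keeps a fixed number of pages live and holds the least recently used ones as gzip-compressed serialized bytes. This is not how JRGzipVirtualizer is implemented internally; GzipPageCache and its methods are hypothetical names, and a page is modeled simply as a list of strings.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

/** Keeps at most maxLivePages pages as objects; older pages are held gzip-compressed. */
class GzipPageCache {
    private final Map<Integer, byte[]> compressed = new HashMap<>();
    private final LinkedHashMap<Integer, ArrayList<String>> live;

    GzipPageCache(final int maxLivePages) {
        // Access-ordered map: the eldest entry is the least recently used page.
        this.live = new LinkedHashMap<Integer, ArrayList<String>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, ArrayList<String>> eldest) {
                if (size() > maxLivePages) {
                    compressed.put(eldest.getKey(), compress(eldest.getValue()));
                    return true;
                }
                return false;
            }
        };
    }

    void put(int pageNo, ArrayList<String> page) {
        compressed.remove(pageNo);
        live.put(pageNo, page);
    }

    /** Returns the page, decompressing it back into the live set if needed; null if unknown. */
    ArrayList<String> get(int pageNo) {
        ArrayList<String> page = live.get(pageNo);
        if (page == null) {
            byte[] data = compressed.remove(pageNo);
            if (data == null) {
                return null;
            }
            page = decompress(data);
            live.put(pageNo, page);
        }
        return page;
    }

    int liveCount() { return live.size(); }

    int compressedCount() { return compressed.size(); }

    static byte[] compress(ArrayList<String> page) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(new GZIPOutputStream(bytes))) {
            out.writeObject(page);
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    @SuppressWarnings("unchecked")
    static ArrayList<String> decompress(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new GZIPInputStream(new ByteArrayInputStream(data)))) {
            return (ArrayList<String>) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}

class GzipPageCacheDemo {
    /** Fills a cache with pageCount small pages, keeping at most maxLive of them uncompressed. */
    static GzipPageCache filled(int pageCount, int maxLive) {
        GzipPageCache cache = new GzipPageCache(maxLive);
        for (int p = 0; p < pageCount; p++) {
            ArrayList<String> page = new ArrayList<>();
            for (int i = 0; i < 3; i++) {
                page.add("page " + p + " line " + i);
            }
            cache.put(p, page);
        }
        return cache;
    }

    public static void main(String[] args) {
        GzipPageCache cache = filled(5, 2);
        System.out.println(cache.liveCount() + " live, " + cache.compressedCount() + " compressed");
    }
}
```

A file-based variant would write the evicted bytes to disk instead of keeping them in a map, trading heap space for file I/O, which is the same trade-off the three virtualizers expose.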
Please see the sample (VirtualizerApp, sourced from JasperSoft) for the coding details. The results below are also sourced from JasperSoft; all times are in milliseconds.
1a) No virtualizer, with a 10 MB max heap size limit, which ended in an out-of-memory error
export:
[java] Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
[java] Java Result: 1
1b) No Virtualizer with default heap size limit (64M)
export2:
[java] null
[java] Filling time : 44547
[java] PDF creation time : 22109
[java] XML creation time : 10157
[java] HTML creation time : 12281
[java] CSV creation time : 2078
2) With JRFileVirtualizer
exportFV:
[java] Filling time : 161170
[java] PDF creation time : 38355
[java] XML creation time : 14483
[java] HTML creation time : 17935
[java] CSV creation time : 5812
3) With JRSwapFileVirtualizer
exportSFV:
[java] Filling time : 51879
[java] PDF creation time : 32501
[java] XML creation time : 14405
[java] HTML creation time : 16579
[java] CSV creation time : 5365
4a) With JRGzipVirtualizer, with lots of GC
exportGZV:
[java] Filling time : 84062
[java] Exception in thread "RMI TCP Connection(22)-127.0.0.1" java.lang.OutOfMemoryError: Java heap space
[java] Exception in thread "RMI TCP Connection(24)-127.0.0.1" java.lang.OutOfMemoryError: Java heap space
[java] Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
[java] Exception in thread "RMI TCP Connection(25)-127.0.0.1" java.lang.OutOfMemoryError: Java heap space
[java] Exception in thread "RMI TCP Connection(27)-127.0.0.1" java.lang.OutOfMemoryError: Java heap space
[java] Java Result: 1
4b) With GZipVirtualizer (max: 13MB)
exportGZV2:
[java] Filling time : 59297
[java] PDF creation time : 35594
[java] XML creation time : 16969
[java] HTML creation time : 19468
[java] CSV creation time : 10313
Code Snippets
1) VirtualizerApp: sourced from JasperSoft. The complete listing is reproduced after the web application example.
2) Let us implement our own virtualizer example in a web application.
a) Struts action:
// Fetch the rows through the application's own DAO.
MytableDAO dao = new MytableDAO();
List results = dao.findAll();

Map parameters = new HashMap();
parameters.put("Title", "Jasper Report");

// Load the precompiled report definition from the web application.
InputStream reportStream = getServletContext().getResourceAsStream("/reports/one.jasper");
JasperReport jasperReport = (JasperReport) JRLoader.loadObject(reportStream);

// Field names, in the order the DAO query selects them.
String[] fields = new String[] { "id", "name" };

// Keep at most 2 pages in memory; the rest are held gzip-compressed.
JRAbstractLRUVirtualizer virtualizer = new JRGzipVirtualizer(2);
parameters.put(JRParameter.REPORT_VIRTUALIZER, virtualizer);

// HibernateDataSource is the application's own JRDataSource implementation (not listed here).
JasperPrint print = JasperFillManager.fillReport(jasperReport, parameters, new HibernateDataSource(results, fields));
if (print != null)
{
    // Stream the report to the browser as a PDF.
    response.setContentType("application/pdf");
    ServletOutputStream outputStream = response.getOutputStream();
    JasperExportManager.exportReportToPdfStream(print, outputStream);
    // manually cleaning up
    if (virtualizer != null)
    {
        virtualizer.cleanup();
    }
    outputStream.flush();
    outputStream.close();
}
else
{
    response.setContentType("text/html");
}
b) DAO method:
public List findAll()
{
    log.debug("finding all Mytable instances");
    try
    {
        // Each element of the returned list is an Object[] holding { id, name }.
        String queryString = "select id, name from Mytable";
        Query queryObject = getSession().createQuery(queryString);
        return queryObject.list();
    }
    catch (RuntimeException re)
    {
        log.error("find all failed", re);
        throw re;
    }
}
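The HibernateDataSource passed to fillReport in the action is not listed in this article. As a rough idea of what such an adapter does, the sketch below walks a list of Object[] rows (the shape a projection query like the one in findAll() returns) and resolves each field name to a column index. ListRowSource is a hypothetical, library-free stand-in: a real implementation would implement JRDataSource, take a JRField in getFieldValue(), and declare JRException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Library-free stand-in for a list-backed data source; names are illustrative only. */
class ListRowSource {
    private final Iterator<Object[]> rows;
    private final List<String> fieldNames;
    private Object[] current;

    ListRowSource(List<Object[]> rows, String[] fieldNames) {
        this.rows = rows.iterator();
        this.fieldNames = Arrays.asList(fieldNames);
    }

    /** Advances to the next row; returns false when the list is exhausted. */
    boolean next() {
        if (!rows.hasNext()) {
            current = null;
            return false;
        }
        current = rows.next();
        return true;
    }

    /** Looks up the column whose position matches the field name's position in fieldNames. */
    Object getFieldValue(String fieldName) {
        if (current == null) {
            throw new IllegalStateException("No current row; call next() first");
        }
        int index = fieldNames.indexOf(fieldName);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown field: " + fieldName);
        }
        return current[index];
    }
}

class ListRowSourceDemo {
    /** Renders two sample rows as "id=name;" pairs to show the cursor in use. */
    static String render() {
        List<Object[]> rows = new ArrayList<>();
        rows.add(new Object[] { 1, "alpha" });
        rows.add(new Object[] { 2, "beta" });
        ListRowSource source = new ListRowSource(rows, new String[] { "id", "name" });
        StringBuilder out = new StringBuilder();
        while (source.next()) {
            out.append(source.getFieldValue("id"))
               .append('=')
               .append(source.getFieldValue("name"))
               .append(';');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render());
    }
}
```

The order of the field names must match the order of the columns in the select clause, which is why the action passes { "id", "name" } alongside the result list.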
/*
* ============================================================================
* GNU Lesser General Public License
* ============================================================================
*
* JasperReports - Free Java report-generating library.
* Copyright (C) 2001-2006 JasperSoft Corporation http://www.jaspersoft.com
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2.1 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, write to the Free Software
* Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
*
* JasperSoft Corporation
* 303 Second Street, Suite 450 North
* San Francisco, CA 94107
* http://www.jaspersoft.com
*/
import java.util.HashMap;
import java.util.Map;
import java.io.File;
import net.sf.jasperreports.engine.JRField;
import net.sf.jasperreports.engine.JRDataSource;
import net.sf.jasperreports.engine.JREmptyDataSource;
import net.sf.jasperreports.engine.JRException;
import net.sf.jasperreports.engine.JRExporterParameter;
import net.sf.jasperreports.engine.JRParameter;
import net.sf.jasperreports.engine.JasperExportManager;
import net.sf.jasperreports.engine.JasperFillManager;
import net.sf.jasperreports.engine.JasperPrint;
import net.sf.jasperreports.engine.JasperPrintManager;
import net.sf.jasperreports.engine.export.JRCsvExporter;
import net.sf.jasperreports.engine.JRVirtualizer;
import net.sf.jasperreports.engine.fill.JRFileVirtualizer;
import net.sf.jasperreports.view.JasperViewer;
import net.sf.jasperreports.engine.fill.JRAbstractLRUVirtualizer;
import net.sf.jasperreports.engine.fill.JRSwapFileVirtualizer;
import net.sf.jasperreports.engine.fill.JRGzipVirtualizer;
import net.sf.jasperreports.engine.util.JRProperties;
import net.sf.jasperreports.engine.util.JRSwapFile;
/**
* @author Teodor Danciu (teodord@users.sourceforge.net)
* @version $Id: VirtualizerApp.java,v 1.8 2006/04/19 10:26:14 teodord Exp $
*/
public class VirtualizerApp
{
    /**
     *
     */
    private static final String TASK_PRINT = "print";
    private static final String TASK_PDF = "pdf";
    private static final String TASK_XML = "xml";
    private static final String TASK_XML_EMBED = "xmlEmbed";
    private static final String TASK_HTML = "html";
    private static final String TASK_CSV = "csv";
    private static final String TASK_VIEW = "view";
    private static final String TASK_EXPORT = "export";
    private static final String VITUALIZER_FILE = "file";
    private static final String VITUALIZER_SWAP_FILE = "swapFile";
    private static final String VITUALIZER_GZIP = "gZip";

    /**
     *
     */
    public static void main(String[] args)
    {
        String fileName = null;
        String outFileName = null;
        String taskName = null;
        String virtualizerType = null;
        if (args.length == 0)
        {
            usage();
            return;
        }
        // Options: -T task, -V virtualizer type, -F report file, -O output file.
        int k = 0;
        while (args.length > k)
        {
            if (args[k].startsWith("-T"))
                taskName = args[k].substring(2);
            else if (args[k].startsWith("-V"))
                virtualizerType = args[k].substring(2);
            else if (args[k].startsWith("-F"))
                fileName = args[k].substring(2);
            else if (args[k].startsWith("-O"))
                outFileName = args[k].substring(2);
            k++;
        }
        try
        {
            // Virtualization works only with in memory JasperPrint objects.
            // All the operations will first fill the report and then export
            // the filled object.
            // creating the data source
            //JRDataSource ds = new JREmptyDataSource(1000);
            InnerDS ds = new InnerDS(200, 1000);
            System.out.println(virtualizerType);
            JasperPrint jasperPrint = null;
            Map parameters = new HashMap();
            JRAbstractLRUVirtualizer virtualizer = null;
            if (VITUALIZER_FILE.equals(virtualizerType))
            {
                // creating the virtualizer
                virtualizer = new JRFileVirtualizer(2, "tmp");
                parameters.put(JRParameter.REPORT_VIRTUALIZER, virtualizer);
            }
            else if (VITUALIZER_SWAP_FILE.equals(virtualizerType))
            {
                // creating the virtualizer
                JRSwapFile swapFile = new JRSwapFile("tmp", 1024, 1024);
                virtualizer = new JRSwapFileVirtualizer(2, swapFile, true);
                parameters.put(JRParameter.REPORT_VIRTUALIZER, virtualizer);
            }
            else if (VITUALIZER_GZIP.equals(virtualizerType))
            {
                // creating the virtualizer
                virtualizer = new JRGzipVirtualizer(2);
                parameters.put(JRParameter.REPORT_VIRTUALIZER, virtualizer);
            }
            // filling the report
            jasperPrint = fillReport(fileName, ds, parameters);
            if (virtualizer != null)
            {
                virtualizer.setReadOnly(true);
            }
            if (TASK_PRINT.equals(taskName))
            {
                JasperPrintManager.printReport(jasperPrint, true);
            }
            else if (TASK_PDF.equals(taskName))
            {
                exportPDF(outFileName, jasperPrint);
            }
            else if (TASK_XML.equals(taskName))
            {
                exportXML(outFileName, jasperPrint, false);
            }
            else if (TASK_XML_EMBED.equals(taskName))
            {
                exportXML(outFileName, jasperPrint, true);
            }
            else if (TASK_HTML.equals(taskName))
            {
                exportHTML(outFileName, jasperPrint);
            }
            else if (TASK_CSV.equals(taskName))
            {
                exportCSV(outFileName, jasperPrint);
            }
            else if (TASK_EXPORT.equals(taskName))
            {
                exportPDF(outFileName + ".pdf", jasperPrint);
                exportXML(outFileName + ".jrpxml", jasperPrint, false);
                exportHTML(outFileName + ".html", jasperPrint);
                exportCSV(outFileName + ".csv", jasperPrint);
                // manually cleaning up
                if (virtualizer != null)
                {
                    virtualizer.cleanup();
                }
            }
            else if (TASK_VIEW.equals(taskName))
            {
                JasperViewer.viewReport(jasperPrint, true);
            }
            else
            {
                usage();
                System.exit(0);
            }
            // Keep the JVM alive after the task finishes, e.g. for inspection with a monitoring tool.
            Thread.sleep(999999999);
        }
        catch (JRException e)
        {
            e.printStackTrace();
            System.exit(1);
        }
        catch (Exception e)
        {
            e.printStackTrace();
            System.exit(1);
        }
    }

    private static void exportCSV(String outFileName, JasperPrint jasperPrint) throws JRException
    {
        long start = System.currentTimeMillis();
        JRCsvExporter exporter = new JRCsvExporter();
        exporter.setParameter(JRExporterParameter.JASPER_PRINT, jasperPrint);
        exporter.setParameter(JRExporterParameter.OUTPUT_FILE_NAME, outFileName);
        exporter.exportReport();
        System.err.println("CSV creation time : " + (System.currentTimeMillis() - start));
    }

    private static void exportHTML(String outFileName, JasperPrint jasperPrint) throws JRException
    {
        long start = System.currentTimeMillis();
        JasperExportManager.exportReportToHtmlFile(jasperPrint, outFileName);
        System.err.println("HTML creation time : " + (System.currentTimeMillis() - start));
    }

    private static void exportXML(String outFileName, JasperPrint jasperPrint, boolean embedded) throws JRException
    {
        long start = System.currentTimeMillis();
        JasperExportManager.exportReportToXmlFile(jasperPrint, outFileName, embedded);
        System.err.println("XML creation time : " + (System.currentTimeMillis() - start));
    }

    private static void exportPDF(String outFileName, JasperPrint jasperPrint) throws JRException
    {
        long start = System.currentTimeMillis();
        JasperExportManager.exportReportToPdfFile(jasperPrint, outFileName);
        System.err.println("PDF creation time : " + (System.currentTimeMillis() - start));
    }

    private static JasperPrint fillReport(String fileName, JRDataSource dataSource, Map parameters) throws JRException
    {
        long start = System.currentTimeMillis();
        JasperPrint jasperPrint = JasperFillManager.fillReport(fileName, parameters, dataSource);
        System.err.println("Filling time : " + (System.currentTimeMillis() - start));
        return jasperPrint;
    }

    /**
     *
     */
    private static void usage()
    {
        System.out.println("VirtualizerApp usage:");
        System.out.println("\tjava VirtualizerApp -Ttask -Ffile [-Vvirtualizer] [-Ooutfile]");
        System.out.println("\tTasks : print | pdf | xml | xmlEmbed | html | csv | export | view");
        System.out.println("\tVirtualizers : file | swapFile | gZip");
    }

    // Synthetic data source: serves numberOfPages chunks of pageSize records each.
    static class InnerDS implements JRDataSource
    {
        int pageSize = 200;
        int numberOfPages = 1000;
        int totalPages = 1000;
        int curIndex = 0;

        public InnerDS()
        {
        }

        public InnerDS(int pageSize, int numberOfPages)
        {
            this.pageSize = pageSize;
            this.numberOfPages = numberOfPages;
            this.totalPages = numberOfPages;
        }

        // Counts down through the records of the current chunk; when a chunk
        // is used up, starts the next one until no chunks remain.
        public boolean next() throws JRException
        {
            curIndex--;
            //System.out.println("CurIndex : " + curIndex + " No Of pages: " + numberOfPages);
            if (curIndex <= 0)
            {
                if (numberOfPages > 0)
                {
                    numberOfPages--;
                    curIndex = pageSize;
                }
            }
            return curIndex > 0;
        }

        // Returns the same generated label for every field of the current record.
        public Object getFieldValue(JRField field) throws JRException
        {
            String retValue = "Page " + (totalPages - numberOfPages) + " record " + (pageSize - curIndex);
            return retValue;
        }
    }
}