Pentaho Analysis Services: Configuration Guide

Configuration Guide

Properties

Property list

Connect strings

Syntax
Connect string properties

Cache management
1. Schema cache
Memory management
1. Out of memory
Logging
1. Configuring log4j within Mondrian's test environment
2. MDX and SQL Statement Logging

Properties

Mondrian reads a properties file, mondrian.properties, that lets you configure how it executes. The file is loaded automatically the first time Mondrian needs a property, but you can also load it explicitly in your code. Mondrian looks for the file in the following places, in order:

In the directory where you started your JVM (the current working directory of the JVM process: java.exe on Win32, java on UNIX/Linux).
If there is no mondrian.properties in the JVM's current working directory, the class loader of the MondrianProperties class tries to locate mondrian.properties on its classpath. You can therefore put mondrian.properties under /WEB-INF/classes when you package Mondrian into a Java web application. The demonstration web applications are configured this way.

These properties are stored as system properties, so they can be set during JVM startup via -D<property>=<value>.
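The lookup order can be sketched in plain Java. This is an illustration only, not Mondrian's own loader: the class PropertiesLookup and its load method are hypothetical, and the final step, in which -D system properties whose names start with "mondrian." override values from the file, is an assumption about precedence.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

/** Hypothetical helper that mimics the lookup order described above. */
final class PropertiesLookup {
    static final String FILE_NAME = "mondrian.properties";

    static Properties load(Path workingDir, ClassLoader loader) {
        Properties props = new Properties();
        Path file = workingDir.resolve(FILE_NAME);
        try {
            if (Files.isRegularFile(file)) {
                // 1. A file in the JVM's working directory wins.
                try (InputStream in = Files.newInputStream(file)) {
                    props.load(in);
                }
            } else {
                // 2. Otherwise fall back to the classpath (e.g. /WEB-INF/classes).
                try (InputStream in = loader.getResourceAsStream(FILE_NAME)) {
                    if (in != null) {
                        props.load(in);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // 3. Assumption: -Dmondrian.xxx=value settings override file values.
        for (String name : System.getProperties().stringPropertyNames()) {
            if (name.startsWith("mondrian.")) {
                props.setProperty(name, System.getProperty(name));
            }
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = load(Paths.get(""), PropertiesLookup.class.getClassLoader());
        System.out.println("mondrian.result.limit = "
            + props.getProperty("mondrian.result.limit"));
    }
}
```

Running the sketch with, say, java -Dmondrian.result.limit=50000 PropertiesLookup prints the value passed on the command line, whether or not a properties file is found.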

Property list

The following properties in mondrian.properties affect the operation of Mondrian.

Not all of the properties in this table are of interest to the end-user. For example, those in the 'Testing' category are only applicable if you are running Mondrian's suite of regression tests.

Limit properties

The properties mondrian.result.limit, mondrian.rolap.iterationLimit and mondrian.rolap.queryTimeout enforce runtime limits on the time or space required to execute a query. If any of these limits is exceeded, Mondrian throws an exception which extends mondrian.olap.ResultLimitExceededException.

Connect strings

Connect string syntax

Mondrian connect strings are a collection of property/value pairs, of the form 'property=value;property=value;...'.

Values can be enclosed in single-quotes, which allows them to contain spaces and punctuation. See the OLE DB connect string syntax specification.
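As a rough illustration of the syntax, the following sketch assembles a connect string from a map of properties. The class ConnectStrings is hypothetical and not part of Mondrian, the JDBC URL is a made-up example, and doubling an embedded single-quote inside a quoted value is an assumption based on the OLE DB convention.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper; builds 'property=value;property=value' strings. */
final class ConnectStrings {
    /** Quotes a value if it is empty or contains whitespace, ';', '=' or a single-quote. */
    static String quote(String value) {
        boolean plain = !value.isEmpty();
        for (char c : value.toCharArray()) {
            if (Character.isWhitespace(c) || c == ';' || c == '=' || c == '\'') {
                plain = false;
            }
        }
        // Assumption: an embedded single-quote is escaped by doubling it.
        return plain ? value : "'" + value.replace("'", "''") + "'";
    }

    /** Joins the properties in iteration order, separated by semicolons. */
    static String build(Map<String, String> properties) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : properties.entrySet()) {
            if (sb.length() > 0) {
                sb.append(';');
            }
            sb.append(e.getKey()).append('=').append(quote(e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("Provider", "Mondrian");
        props.put("Jdbc", "jdbc:odbc:MondrianFoodMart"); // hypothetical URL
        props.put("JdbcDrivers", "sun.jdbc.odbc.JdbcOdbcDriver");
        props.put("Catalog", "file:demo/FoodMart.xml");
        props.put("Role", "Pacific region manager");
        System.out.println(build(props));
    }
}
```

Only the Role value contains spaces, so it is the only value the sketch wraps in single-quotes.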

The supported properties are described below.

Connect string properties

Name	Required?	Description
Provider	Yes	Must have the value "Mondrian".
Jdbc	Exactly one	The URL of the JDBC database where the data is stored. You must specify either `DataSource` or `Jdbc`.
DataSource	Exactly one	The name of a data source loaded via JNDI. The name must be a valid JNDI name, and the object referenced must implement the javax.sql.DataSource interface. You must specify either `DataSource` or `Jdbc`.
JdbcDrivers	Yes	Comma-separated list of JDBC driver classes, for example, `JdbcDrivers=sun.jdbc.odbc.JdbcOdbcDriver,oracle.jdbc.OracleDriver`
JdbcUser	No	The name of the user to log on to the JDBC database. (If your JDBC driver allows you to specify the user name in the JDBC URL, you don't need to set this property.)
JdbcPassword	No	The password to log on to the JDBC database. (If your JDBC driver allows you to specify the password in the JDBC URL, you don't need to set this property.)
Catalog	Exactly one	The URL of the catalog, an XML file which describes the schema: cubes, hierarchies, and so forth. For example, `Catalog=file:demo/FoodMart.xml` Catalogs are described in the Schema Guide. See also `CatalogContent`.
CatalogContent	Exactly one	An XML string representing the schema: cubes, hierarchies, and so forth. For example, `CatalogContent=<Schema name="MySchema"><Cube name="Cube1"> ... </Schema>` Catalogs are described in the Schema Guide. See also `Catalog`.
CatalogName	No	Not used. If, in future, Mondrian supports multiple catalogs, this property will specify which catalog to use. See also `Catalog`.
PoolNeeded	No	Tells Mondrian whether to add a layer of connection pooling. If the value "true" is specified, or no value is specified, Mondrian assumes that: connections created via the `Jdbc` property are not pooled, and therefore need to be pooled; connections created via the `DataSource` are already pooled. If the value "false" is specified, Mondrian does not apply connection-pooling to any connection.
Role	No	The name of the role to adopt for access-control purposes. If not specified, the connection uses a role which has access to every object in the schema. This property can contain multiple role names separated by commas. If so, queries in the connection execute with the sum of the privileges of all of the roles; the effect is the same as running under a union role, defined using the `<Union>` element in the schema file. If a role name contains a comma, escape the comma using an extra comma. For example, a connection created with `Role='Pacific region manager,Europe,, Middle East and Africa manager'` will execute with the combined privileges of the roles "Pacific region manager" and "Europe, Middle East and Africa manager".
jdbc.*	No	Any property whose name begins with "jdbc." will be added to the JDBC connection properties, after removing this prefix. This allows you to specify connection properties without a URL. For example, given the properties `jdbc.Timeout=50; jdbc.CacheSize=1m` Mondrian will create a JDBC connection using the properties {Timeout="50", CacheSize="1m"}.
UseContentChecksum	No	Allows Mondrian to work with a dynamically changing schema. If this property is set to `true` and the schema content has changed (the previous checksum does not equal the current one), the schema is reloaded. The default is `false`. Can be used in combination with the `DynamicSchemaProcessor` property.
UseSchemaPool	No	Controls whether a new connection uses a schema from the schema cache. If `true`, the default, a connection shares a schema definition (and hence also a cache of aggregate data retrieved by previous queries) with other connections which have a textually identical schema definition. If `false`, the connection has a private schema definition and cache.
DynamicSchemaProcessor	No	The name of a class which is called at runtime in order to modify the schema content. The class must implement the mondrian.spi.DynamicSchemaProcessor interface. For example, `DynamicSchemaProcessor = mondrian.i18n.LocalizingDynamicSchemaProcessor` uses the builtin schema processor class mondrian.i18n.LocalizingDynamicSchemaProcessor to replace variables in the schema file, according to resource files and the current locale (see the `Locale` property).
Locale	No	The requested Locale for the current session. The locale determines the formatting of numbers and date/time values, and Mondrian's error messages. Example values are "en" (English), "en_US" (United States English), "hu" (Hungarian). If Locale is not specified, the system's default locale is used, as per java.util.Locale#getDefault().
JdbcConnectionUuid	No	A unique identifier for the connection. If this is set, Mondrian will look at this property and no other to determine whether two data sources should be considered the same. You must ensure that connections will only share a JdbcConnectionUuid if they point to the same database.
AggregateScanCatalog	No	The name of the database catalog to scan when loading aggregate tables. If this is not set, Mondrian will read all catalogs the database connection has access to when loading aggregate tables.
AggregateScanSchema	No	The name of the database schema to scan when loading aggregate tables. If this is not set, Mondrian will read all schemas the database connection has access to when loading aggregate tables.

Connect string properties are also documented in the RolapConnectionProperties class.
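The comma-escaping rule for the Role property can be illustrated with a small parser. This is a sketch of the rule as described in the table, not Mondrian's own parsing code; the class RoleNames is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that splits a Role value into role names. */
final class RoleNames {
    static List<String> parse(String roleList) {
        List<String> roles = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < roleList.length(); i++) {
            char c = roleList.charAt(i);
            if (c != ',') {
                current.append(c);
            } else if (i + 1 < roleList.length() && roleList.charAt(i + 1) == ',') {
                // A doubled comma is an escaped, literal comma.
                current.append(',');
                i++;
            } else {
                // A single comma separates two role names.
                roles.add(current.toString());
                current.setLength(0);
            }
        }
        roles.add(current.toString());
        return roles;
    }

    public static void main(String[] args) {
        String value = "Pacific region manager,Europe,, Middle East and Africa manager";
        for (String role : parse(value)) {
            System.out.println(role);
        }
    }
}
```

For the example value from the table, the sketch yields the two role names "Pacific region manager" and "Europe, Middle East and Africa manager".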

Cache management

Schema cache

To flush all schema definitions, use the mondrian.olap.CacheControl.flushSchemaCache() method:

import mondrian.olap.*;

Connection connection; // an existing Mondrian connection, obtained elsewhere
CacheControl cacheControl = connection.getCacheControl(null);
cacheControl.flushSchemaCache();

The cache is only used when creating new connections; existing connections retain their schemas.

There are four connect string properties that control the use of the Schema cache: UseSchemaPool, UseContentChecksum, CatalogContent and DynamicSchemaProcessor.

The UseSchemaPool property controls whether the cache is used, regardless of the values of the other properties. If UseSchemaPool is "false", the cache is not used and each request for a schema object creates a new one. This entails re-parsing the schema definition and re-scanning the database for metadata and aggregate tables, which is very slow, and the in-memory aggregate cache is not reused.

Next, if UseContentChecksum is "true", a checksum (MD5) is computed from the schema definition content (not its URL), and this checksum is used as the key to look up previously cached versions of the schema definition. If two schema definitions produce different checksums, they are treated as different schemas. (It is possible that only a comment or some whitespace changed, in which case the two schemas are logically the same, but because their checksums differ, different schema objects are used.) If UseContentChecksum is "false", no checksum is computed; instead, the key is built from a combination of connection attributes: either "catalogUrl", "connectionKey", "jdbcUser" and "dataSourceStr", or "catalogUrl" and "dataSource".

If CatalogContent is specified, it is used as the schema definition content, and the value of DynamicSchemaProcessor, if any, is ignored.
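To see why a whitespace-only change leads to a different cache entry, consider this sketch of a content-based key. It only illustrates the idea of an MD5 checksum over the schema text; the class SchemaChecksum is hypothetical, and the exact bytes and encoding Mondrian feeds into its checksum are not specified here.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that derives a cache key from schema content. */
final class SchemaChecksum {
    /** Returns the MD5 digest of the content as a lowercase hex string. */
    static String md5Hex(String schemaContent) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(schemaContent.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        String schema = "<Schema name=\"MySchema\"></Schema>";
        String sameSchemaExtraSpace = schema + " ";
        // Logically identical schemas, but the keys differ.
        System.out.println(md5Hex(schema));
        System.out.println(md5Hex(sameSchemaExtraSpace));
    }
}
```

The two keys printed by main differ even though the schemas describe the same cubes, so a cache keyed this way would hold two separate schema objects.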

Finally, the DynamicSchemaProcessor connect string property is the name of a class that implements the DynamicSchemaProcessor interface. If set, an instance of the class is created for each schema request and its "processSchema" method is called, which returns the schema definition content.

Memory management

Out Of Memory

Java OutOfMemoryError errors have always been an issue with applications. When the JVM throws an Error as opposed to an Exception it is telling the application that its world has ended and it has no recourse but to die. Prior to Java5 there was not much one could do other than buy 64-bit machines with lots of RAM and hope for the best. For a multi-user Mondrian environment with potentially very large data-sets and clients that can generate queries requesting arbitrarily large amounts of that data, this can be an issue. This is especially the case when Mondrian is being hosted on some corporate web-server; applications that kill web-servers are not looked upon favorably by IT.

With Java5 (and Java6, etc.) there is an alternative. An application can take advantage of a new feature in Java5 allowing the application to be notified when memory starts running low. This allows the application to take preemptive action prior to an OutOfMemoryError being generated by the Java runtime.

Mondrian takes advantage of this new feature. Rather than passing an OutOfMemoryError to its client, it will now stop processing the present query, free up data structures associated with the present query and return a MemoryLimitExceededException to the client. The MemoryLimitExceededException is one of Mondrian's ResultLimitExceededException subclasses, which are used to tell clients that a limit has been exceeded, in this case, memory usage.

By default, for Mondrian running under Java5, this feature is enabled and the "safety limit" is set at 90 percent: when memory usage reaches 90 percent of the maximum possible, processing of the current query is stopped and a MemoryLimitExceededException is returned to the client. See the Memory monitoring properties above on this page for additional information.

Lastly, the gorilla in the closet. Java5 in its wisdom only allows for one memory threshold notification level to be registered with the JVM. What this means is if within the same JVM, some code registers one level, say, at 80% (here I use percentages for ease of presentation rather than number of bytes which is what the Java5 API actually supports) and some other code later on registers a level of 90%, then it is the 90% that the JVM knows about - it knows nothing of the previously registered 80%. What this means is that the code expecting to be notified when the memory level crosses 80%, won't be notified!
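The overwrite behaviour can be reproduced with the standard java.lang.management API. The sketch below is not Mondrian's memory monitor; the class ThresholdDemo is hypothetical, and it simply picks the first heap memory pool that supports a usage threshold, which is an assumption about which pool a real monitor would watch.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;

/** Hypothetical demo: a later threshold replaces an earlier one. */
final class ThresholdDemo {
    /** Returns the first heap pool that supports a usage threshold, or null. */
    static MemoryPoolMXBean findHeapPool() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP && pool.isUsageThresholdSupported()) {
                return pool;
            }
        }
        return null;
    }

    /** Sets the pool's single usage threshold to a percentage of its maximum size. */
    static long setPercentThreshold(MemoryPoolMXBean pool, int percent) {
        long max = pool.getUsage().getMax();
        if (max < 0) {
            // The maximum is undefined for this pool; fall back to committed memory.
            max = pool.getUsage().getCommitted();
        }
        long threshold = max / 100 * percent;
        pool.setUsageThreshold(threshold);
        return threshold;
    }

    public static void main(String[] args) {
        MemoryPoolMXBean pool = findHeapPool();
        if (pool == null) {
            System.out.println("No heap pool supports usage thresholds");
            return;
        }
        long first = setPercentThreshold(pool, 80);  // "application A"
        long second = setPercentThreshold(pool, 90); // "application B"
        System.out.println("A asked for " + first + " bytes, B asked for " + second
            + " bytes, pool now uses " + pool.getUsageThreshold() + " bytes");
        pool.setUsageThreshold(0); // a threshold of zero disables the check again
    }
}
```

After the second call the pool reports only the 90% value; nothing records that other code had asked for 80%.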

For many applications that don't share their JVM with other applications, this is not a problem, but for Mondrian it is potentially an issue. Mondrian can be running in a web-server, and web-servers can host more than one independent application. Each such application can register a different memory threshold notification level. In general, application-containing applications such as web-servers or application-servers are a problem with the current Java5 memory threshold notification approach. At the current time, I do not know a way around this problem.

Logging

Mondrian uses log4j for all information and debug logging. When running within an application server, Mondrian's log4j configuration is determined by the server's or web application's log4j configuration. Please see log4j's documentation for additional details.

Configuring log4j within Mondrian's test environment

When running outside an application server, log4j determines the location of the log4j.xml file via the log4j.configuration Java system property. log4j treats this string as a URL, so to have it detect the log4j file on the file system, you must use the syntax "file:DIR/log4j.xml". Relative paths are acceptable, so if you have your log4j.xml file in the root directory of Mondrian, "file:log4j.xml" will load the correct file. You may specify the log4j.configuration property in mondrian.properties, because Mondrian's ant build file explicitly sets the property as a JVM system property when running JUnit tests.

MDX and SQL Statement Logging

The default log4j.xml file is configured so that a separate log file is created for both MDX and SQL statement logging. In the code, the MDX and SQL strings are logged at the debug level, so to disable them you can set the log level to INFO or any other level above debug. Statement logging occurs within the log4j categories "mondrian.mdx" and "mondrian.sql". These categories log the statements and how long they took to execute. The SQL log also records the number of results returned in the result set.

For example, to trace both MDX and SQL statements, create a file named log4j.properties in the directory where you start Mondrian, with the following contents:

# Set root logger level to WARN and its only appender to MONDRIAN.
log4j.rootLogger=WARN, MONDRIAN

# MONDRIAN is set to be a ConsoleAppender.
log4j.appender.MONDRIAN=org.apache.log4j.ConsoleAppender

# MONDRIAN uses PatternLayout.
log4j.appender.MONDRIAN.layout=org.apache.log4j.PatternLayout
log4j.appender.MONDRIAN.layout.ConversionPattern=%-4r [%t] %-5p %c %x - %m%n

# Trace MDX and SQL statements
log4j.category.mondrian.mdx=DEBUG, MONDRIAN
log4j.category.mondrian.sql=DEBUG, MONDRIAN

Then run Mondrian with the argument -Dlog4j.configuration=file:log4j.properties on the Java command line.

Consider setting the property mondrian.rolap.generate.formatted.sql=true in mondrian.properties to make the format more readable.

Author: Julian Hyde; last modified April, 2011.
Copyright (C) 2006-2011 Pentaho
