OLAP String Storage and its Usage in Jedox

This article outlines the specifics of the current string storage in Jedox OLAP, as well as its limitations and recommended best practices.

String storage in Jedox OLAP

Design and implementation

String data in Jedox OLAP cubes is stored as text, allowing various types of non-numeric information such as rights, descriptions, and other text data to be included as content in a cube cell.

This storage method is designed to efficiently handle text data, ensuring that it can be quickly retrieved and displayed when needed.

Typical use cases
  • User rights: storing access rights values such as D, W, R, and N.
  • Project-related definitions: Integrator projects, View definitions, etc.
  • Customer information: managing customer names, addresses, and other descriptive data.
  • Categorical data: handling categories, labels, and other non-numeric classifications that are essential for data analysis and reporting.

Comparison to numeric storage

String storage

String storage is designed for storing and retrieving text data of varying sizes and is optimized for handling non-numeric information. However, it may require more memory and processing power for text searches and manipulations. Because strings vary in size, the string storage can become fragmented as strings are added, updated, and deleted. When this fragmentation reaches the defined limit, the storage must be rebuilt (defragmented) to reclaim the unused space.
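
The fragmentation behavior described above can be illustrated with a toy model. The following is a minimal sketch, not the Jedox implementation: the class ToyStringStore, its append-only layout, and the rebuild_limit fraction are all assumptions made for illustration.

```python
class ToyStringStore:
    """Toy append-only string store (hypothetical, NOT Jedox internals)."""

    def __init__(self, rebuild_limit=0.5):
        self.buffer = bytearray()           # raw bytes, including dead space
        self.index = {}                     # cell key -> (offset, length)
        self.rebuild_limit = rebuild_limit  # assumed: fraction of dead bytes
        self.rebuilds = 0

    def live_bytes(self):
        return sum(length for _, length in self.index.values())

    def fragmentation(self):
        """Share of the buffer that no longer belongs to any cell."""
        if not self.buffer:
            return 0.0
        return 1.0 - self.live_bytes() / len(self.buffer)

    def set(self, key, text):
        data = text.encode("utf-8")
        # An update appends a new copy; the old bytes become dead space.
        self.index[key] = (len(self.buffer), len(data))
        self.buffer.extend(data)
        self._maybe_rebuild()

    def delete(self, key):
        self.index.pop(key, None)
        self._maybe_rebuild()

    def get(self, key):
        offset, length = self.index[key]
        return self.buffer[offset:offset + length].decode("utf-8")

    def _maybe_rebuild(self):
        if self.fragmentation() > self.rebuild_limit:
            self.rebuild()

    def rebuild(self):
        """Copy only live strings into a fresh buffer (defragmentation)."""
        compact = bytearray()
        new_index = {}
        for key, (offset, length) in self.index.items():
            new_index[key] = (len(compact), length)
            compact.extend(self.buffer[offset:offset + length])
        self.buffer, self.index = compact, new_index
        self.rebuilds += 1


store = ToyStringStore(rebuild_limit=0.5)
for value in ("xxxx", "yyyy", "zzzz"):
    store.set("cell", value)   # repeated updates of the same cell
```

In this model, the rebuild copies every live string, which is why its cost grows with the size of the storage.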

Numeric storage

Numeric storage is optimized for calculations and aggregations by storing fixed-sized data in a format that allows for fast arithmetic operations and summations.

Guidelines for string data

Optimal data volumes

Keep string data volumes manageable to ensure fast retrieval and processing.

As a guideline, keep the number of strings per cube in the thousands. If the number of strings reaches into the millions due to extensive updating, it may be necessary to rebuild the string storage more often to keep the total volume acceptable. As the size of the string storage increases, each defragmentation operation takes more time and resources.

For specific use cases, the configuration option string-storage-rebuild-limit can be raised so that a rebuild is triggered later, balancing fragmentation against performance according to the requirements of the project. Test any scenario that uses non-default settings before relying on it. If large amounts of string data are frequently updated and string-storage-rebuild-limit needs to be set to a high value, use the option string-storage-rebuild-schedule. This option schedules the rebuild for a specific time, which avoids slowdowns during periods of high workload.
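
The trade-off behind the rebuild limit can be explored with a small simulation. This is a rough sketch under stated assumptions: the function simulate is hypothetical, the limit is modeled as a fraction of dead space, and the actual unit, default, and syntax of string-storage-rebuild-limit are not described here. The string-storage-rebuild-schedule option is not modeled.

```python
def simulate(num_updates, string_bytes, rebuild_limit):
    """Return (rebuild_count, peak_total_bytes) for one cell updated repeatedly.

    Assumed model: every update appends a new copy of the string and leaves
    the previous copy as dead space. A rebuild is triggered once the dead
    share of the storage exceeds rebuild_limit.
    """
    total = 0      # bytes occupied, live and dead
    rebuilds = 0
    peak = 0
    for _ in range(num_updates):
        total += string_bytes
        live = string_bytes            # only the newest copy is live
        peak = max(peak, total)
        if (total - live) > rebuild_limit * total:
            total = live               # rebuild keeps live data only
            rebuilds += 1
    return rebuilds, peak


results = {limit: simulate(1000, 100, limit) for limit in (0.5, 0.9)}
for limit, (rebuilds, peak) in results.items():
    print(f"limit={limit}: rebuilds={rebuilds}, peak bytes={peak}")
```

In this model, a higher limit means fewer rebuilds but a larger peak storage size, which is the kind of ratio worth testing against real project data.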

Use relevant configuration settings and optimization techniques to improve the size and performance of the string storage in OLAP cubes.

Updated November 21, 2024