Rules Upgrade Troubleshooting

image_pdfimage_print

Jedox 7.0 was a major breakthrough in the Jedox OLAP line. It changed the behavior of some features to improve consistency and extended the set of scenarios it could handle. It is also more sensitive to errors in the rules. Below are the most common migration issues and their solutions.

Invalid cell references

Since version 7.0, Jedox OLAP distinguishes between zero and null. Before 7.0, a zero value was interpreted as null. Operations with null results mean nothing; DDE just ignores such values. It doesn’t allocate memory to store it and doesn’t use CPU to process them. It is the default value of any cube cell after the cube is created and before any value is loaded into it or any rule is defined.

Symptoms
  • Non-zero “error” counter rule calculation statistics
  • Warning message in palo.log: “Invalid source for rule xxx dimension nr. …”
  • Debug messages in palo.log “element xxx not found in dimension yyy” – verbose level has to be set to debug
  • GPU Advisor in Excel Add-in reports “rule error” for some of the cube rules
  • OLAP memory usage is high
  • Slow calculation, calculation stack not changing
Solution
  • Check consistency of attribute cubes. All cells used as element names should exist and have valid content. Temporarily define a rule for an attribute to validate content that checks for typos and completeness.
    For example, the following rule checks that all attribute values for ‘Company Code’ are valid element names in the Cost Center dimension.

    [‘Company Code’] = IF(ISERROR(PALO.EINDEX(“”,”Cost Center”,[])),CONCATENATE(“invalid value: “,[]),null)

     You can then browse the attribute table and fix all reported cells. Once finished, disable the rule.

  • Define a bumper rule for the end of rule recursion, or add IF in the recursive rule to handle termination of the recursion.

Use of (numeric) literals

Symptoms
  • Non-zero “zero result” counter in rule calculation statistics (if 0 is used instead of null)
  • Slow calculation
  • High OLAP memory usage
Solution
  • Search your exported rules for “0”, typically found in these sequences: ",0," or ",0)"
  • Replace 0 or "" with null wherever possible.
    For example, change
    [‘x’] = if ([‘x’]>0, [‘x’], 0)
    to
    [‘x’] = if ([‘x’]>null, [‘x’], null)
    or change

    [‘x’] = max([‘x’],0)
    to
    [‘x’] = max([‘x’], null)

  • If literals 0 or "" are used in equality comparisons as part of a condition within an IF statement, this comparison can be omitted completely. For example:
    [‘x’] = IF(PALO.DATA(“”,”#_Months”,”PrevMonth”,!’Months’,”~”) != “”, [‘y’], 0)
    or
    [‘x’] = IF(PALO.DATA(“”,”#_Months”,”PrevMonth”,!’Months’,”~”) != 0, [‘y’], 0)
    or
    [‘x’] = IF(PALO.DATA(“”,”#_Months”,”PrevMonth”,!’Months’,”~”) != null, [‘y’], 0)
    can be changed to:

    [‘x’] = if (PALO.DATA(“”,”#_Months”,”PrevMonth”,!’Months’,”~”), [‘y’], null)

Calculated zeros

Sometimes zeros are produced as secondary data, often in cases when a change of account in time was calculated from the opening and closing value. For non-null values, subtraction is evaluated as 0 (was null in version 6.0 and earlier).

Since version 7.0, OLAP is also strictly projecting error values of calculated rules into consolidations. For example, if cell [‘Jan’, ‘2018’, ‘ProductA’, ‘Actual’] is evaluated as an error, then all consolidated cells that depend on it are also errors. There can be billions of such cells, depending on the number of dimensions and their hierarchies. Previous OLAP versions made calculation faster, but users got no feedback about typos in the rules or inconsistencies in a calculation, such as when there is incomplete attribute data.

Typically, PALO.DATA uses parameter values from:

  • attribute content
  • hierarchy navigation functions, such as PALO.EPREV, PALO.EOFFSET, and PALO.ECHILD
  • name mapping, when there are two different time dimensions and an element is missing from one of them, e.g.  Plan_Period and Fact_Period dimensions, when the element 2018 from Plan_Period is missing in the Fact_period dimension
  • string manipulation – e.g. period “~” concatenated with “_YTD” can be an invalid name
Symptoms
  • Non-zero “zero result” counter in rule calculation statistics (if 0 is used instead of null)
  • Slow calculation
  • Possibly high memory usage
Solution
  • Replace numeric literal 0 with null
  • When calculating the difference from two nonzero values, use a non-equality test and return null if test is negative.
    For example, change
    [‘diff’] = [‘closing’] – [‘opening’]
    to
    [‘diff’] = if ([‘closing’] != [‘opening’], [‘closing’] – [‘opening’], null)

Fragmented dimensions

When Integrator is used repeatedly to update dimensions, and it deletes elements and creates them again, dimension elements can get fragmented. OLAP cannot hold its optimal structure to access these elements and spends more CPU cycles to find them. In some operations, the slowdown can be tens of percents.

Symptoms
  • Highest element ID in a dimension is much higher than lowest element ID (element IDs can be found by clicking on the dimension name in the OLAP server browser).
  • Slow calculation/dimension browsing.
Solution

The database can be compacted using the database script feature. In the Jedox Web Modeler, right-click on the database and select “Create Database Script”. Give the new script a temporary name and run the script created in the previous step within it. Rename the original database and use its name for the database created in the previous step. It reduced calculation times for several models by 30-50%.

PALO.DATA order in source and target cube

Symptoms
  • Red lines in OLAP admin cube analyzer graph, as in the example below:

  • Slow calculation
  • High counter in “Plan Nodes statistics” for nodes of type CELLMAP and TRANSFORMATION
Solution
  • Make sure that all cubes linked via PALO.DATA formulas use the same order of shared dimensions.

Multiple usage of the same expressions in rules

OLAP doesn’t yet optimally process expressions that are repeatedly used in one rule, e.g. [‘max’] = if ([‘a’] > [‘b’], [‘a’], [‘b’])

Symptoms
  • Slow calculation
Solution
  • Try to use aggregation function when possible.
    For example, change
    [‘x’] = IF([‘a’] > [‘b’], [‘a’], [‘b’])
    to
    [‘x’] = max([‘a’], [‘b’])
  • Move expressions repeated in IF outside of the IF.
    For example, change
    [‘x’] = IF([‘x’], [‘x’]*[‘y’], [‘y’])
    to
    [‘x’] = [‘y’] * IF([‘x’], [‘x’], 1)

Equality comparison

Model rules use comparison operators that include a cell equality test ( ==, >= and <=) with zero, an empty string, or another cell reference.

Symptoms
  • The high counter in “Plan Nodes statistics” for nodes of type EQ, LE, GE
  • Slow calculation
Solution
  • Switch 2nd and 3rd IF parameters and exchange comparison operator for one with the opposite meaning (== to !=, >= to <, <= to >)
    For example, change
    [‘x’] = if ([‘y’] == 0, [‘z’], -[‘z’])
    to
    [‘x’] = if ([‘y’], -[‘z’], [‘z’])

    or change
    [‘x’] = if (PALO.DATA(“”, “s”, “a”) >= PALO.DATA(“”, “s”, “b”), [‘a’], [‘b’])
    to
    [‘x’] = if (PALO.DATA(“”, “s”, “a”) < PALO.DATA(“”, “s”, “b”), [‘b’], [‘a’])

Insufficient cache size

In earlier versions of Jedox, the default cache size was too small – 1,000,000 cell values per cube. Version 7.1 has the default cache size set to 100,000,000 (non-null) cell values per cube.
The value is used for many temporary structures. If set too small, OLAP can use slower modes of algorithms.

Symptoms
  • Slow calculation
Solution
  • Make sure that palo.ini value cache barrier is at least 100,000,000.

Multi-core rule processing

CPU DDE tries to parallelize rules calculation. In many cases, rules are too complex for parallelization and can slow down calculation.

Symptoms
  • Slow calculation
Solution
  • Try to disable multicore rules calculation with the following line in the palo.ini:
    engine-configuration 1
  • If the engine configuration already exists in your setup, make sure the parameter contains 1. It can be added to the end of the parameter string.

Fix 5.1 rules with IF “optimization”

For OLAP 5.1, it was recommended to replace the following IF statement

IF (condition, true_expression, false_expression)

with

(condition) * (true_expression) + (1-condition) * (false_expression)

In 7.1, this approach becomes a specific case of the calculated zeros scenario described above. The problem is in the part (1-condition). If the condition is true (1), result of this is 0, and when multiplied with non-null (false_expression), it results in a calculated zero for potentially large number of cells.

Symptoms
  • Non-zero “Zero result” counter in rule calculation statistics (if 0 is used instead of null)
  • Slow calculation
  • Possibly high memory usage
Solution
  • Switch rule back to use IF statement. It is fully supported by 7.1 DDEs.

Deep recursions

OLAP should be able to process recursions deeper than some hundreds of iterations. Once the system runs out of resources, it returns the error ERROR_RULE_CALCULATION_TOO_DEEP.

Symptoms
  • Some rule time statistics are bigger than total engine rules calculation time (deeper recursion iterations are added multiple times)
  • Slow calculation
  • Sometimes high memory consumption
Solution
  • If possible, use consolidations instead of recursion, e.g. _YTD hierarchy in Month dimension.

DDE unsupported scenarios

DDE can handle most rule types. The following functions are natively supported by DDE in 7.1:

IF, ABS, ADD, AND, AVERAGE, CEILING, CONTINUE, DEL, DIV, EQ, EVEN, FACT, FLOOR, GE, GT, INT, ISERROR, LE, LN, LOG, LOG10, LT, MAX, MEDIAN, MIN, MOD, MUL, NE, NOT, ODD, OR, ROUND, SQRT, STET, SUM, TRUNC, PALO.DATA, PALO.MARKER

Functions without native DDE support in 7.1:

ACOS, ASIN, ATAN, CHAR, CLEAN, CODE, CONCATENATE, COS, COUNT, DATE, DATEFORMAT, DATEVALUE, EXACT, EXP, FIRST, ISNULL, LAST, LEFT, LEN, LOWER, MID, NOW, PI, POWER, PROPER, QUOTIENT, RAND, RANDBETWEEN, REPLACE, REPT, RIGHT, SEARCH, SIGN, SIN, STR, SUBSTITUTE, TAN, TRIM, UPPER, VALUE, VALUEDATE, WEEKDAY, PALO.ECHILD, PALO.ECHILDCOUNT, PALO.ECOUNT, PALO.EFIRST, PALO.EINDENT, PALO.EINDEX, PALO.EISCHILD, PALO.ELEVEL, PALO.ENAME, PALO.ENEXT, PALO.EOFFSET, PALO.EPARENT, PALO.EPARENTCOUNT, PALO.EPREV, PALO.ESIBLING, PALO.ETOPLEVEL, PALO.ETYPE, PALO.EWEIGHT, PALO.CUBEDIMENSION, PERCENTILE, ES, parameter value (!’DimensionName’)

The unsupported functions listed above can cause the engine to switch to the Structure/Marker Driven Engine – and potential slowdown – only if the following two conditions apply:

  • the function is NOT used within the 3rd or further PALO.DATA/PALO.MARKER parameter and when the parameter expression depends on one or fewer variables (!’dimension_name’)
  • the rule is being evaluated for more than 10.000 cells – typically B: rules

Rule examples :

[…] = PALO.DATA(“”,””,CONCATENATE(!’Month’,!’Year’))
Not DDE supported: function is used within the 3rd parameter but uses 2 variables – !’Month’ and !’Year’

[…] = PALO.DATA(“”,””,CONCATENATE(!’Month’,”_YTD”))
DDE supported: function used in 3rd parameter, uses only variable !’Month’

[…] = CONCATENATE(!’Month’,!’Year’))
DDE supported because the size of the Month dimension x  the size of  the Year dimension is less than 10.000 cells.

Symptoms
  • Large evaluation counter of some rule(s), typically including large zero or null result counter in calculation statistics (if 0 is used instead of null)
Solution
  • Try to eliminate unsupported DDE function usage if possible
  • Create marker for such a rule

Unexpected cell results

Symptoms
  • Results that do not match your expectations, typically resulting in null (0) where you expect some number.
  • Markers are defined in the rule calculating the cell or some source cell.
Solution
  • The Trace Cell Value feature in Excel Add-in lets you expand a value in a cell to the primary data level. In an Excel report, move your mouse cursor to a specific cell and right-click to open the window that allows you to recursively trace the whole cell calculation:

If the Cell Tracer result is different from what you can see in the Excel report, and your model uses markers, make sure these are correctly defined. In case markers are used, and the rule is not fully DDE compatible, the report could show 0 as a result but Cell Tracer shows a non-zero value. Of course, if other cube cells depend on this problematic rule, the error is propagated using the logic of dependent rules.

Example of rules with incomplete markers:

[] = if (['a'], ['b'], [['c']]) // [[‘b’]] should be used.

[] = ['a'] +1 @ [['b']]  // rule always has a nonzero result, but external marker [[‘b’]] can cause that MDE skips cells for which [‘b’] is null and returns null as a result.

image_pdfimage_print