Changing how Cognos writes outer joins
A quick fix you might try, is to wrap your Report Studio filter in a coalesce before comparing to ‘RESIDENTIAL’, like:
coalesce( [field], ‘RESIDENTIAL’ ) = ‘RESIDENTIAL’
This would handle the cases where [field] is NULL (which would be the result of a LEFT OUTER JOIN without a matching value in the right table).
You indicated that you want an unconditional filter to only get rows where
table2.category = ‘RESIDENTIAL’.
so that is what you get.?You only did get rows where table2.category =
‘RESIDENTIAL’, no more, no less.
x is null or x = ‘residential’
That being said my example would be solved by 6campbells solution. I tried to not be too complicated in describing my problem, and unfortunately needed to be a bit more complex.
The particular problem I am having is where the field being filtered on is OK to be null (it is not required in the database) but the field being displayed is required, and I don’t want to see ones where the filter is NULL.
Table_B: ID, Campus, Category
Rows of data in Table_B could be:
2 LAMONI RESIDENTIAL
2 CAMERON DISTANCE
3 CAMERON RESIDENTIAL
4 UNKNOWN UNKNOWN
and if I want Campus if Category is ‘RESIDENTIAL’
I expect (using my previous select from Table_A and Campus instead of Category from Table_B):
2 UNDG LAMONI
3 COMB CAMERON
Just plain Cognos would of course give me:
2 UNDG LAMONI
3 COMB CAMERON
\[Table_B\].\[Category\] = ‘RESIDENTIAL’ or \[Table_B\].\[Category\] is NULL
I would get
1 GRAD HAMPTON
2 UNDG LAMONI
3 COMB CAMERON
Life would be grand if every field all my authors needed to filter on was always required in the DB then this would work, but too often for my taste it is not. And I don’t have power to change that problem I’m handed =)
would work in the example you’ve given. What I’m not comfortable with in that answer is the coincidence of using UNKOWN as the value if [Category] is null and also the value of the field. I’m relatively sure that it is not a sure thing that you can set NULLs to the correct field value in every situation.
I’m royally confused about what you really want. It looks like if you have multiple occurrences of an ID, you only want the RESIDENTIAL row, otherwise you want everything.
I’m going to say LEFT and RIGHT in here. I’m using the LEFT table to mean the table with data that I always want information from and the RIGHT side as the table that it is outer joined to (and I want only some information from).
I want all information from the LEFT table and only certain rows from the RIGHT that match a filter. But I don’t want the filter on the RIGHT to filter the results of the LEFT. By default Cognos filters both. The problem is that sometimes a row can exist in the RIGHT that has data, but the field I’m filtering on is NULL and I don’t want that row (NULL does not match my criteria). So if I use the coalesce or “”or [Field] is NULL”” solution I will return extra rows from the RIGHT. But if I don’t use it then I have the LEFT getting filtered by the RIGHT. Worse is the field I am pulling is user data, not codes. So I can’t just determine which is okay to pull if the field I’m filtering on is NULL and do a complex filter.
Currently we are just creating a query for the RIGHT, doing its filtering and then manually joining to the LEFT table. I can keep doing this, I was just hoping to find a solution that could help keep reports from getting manual joins.
email@removed | 641.784.5056
are surprised when they are informed that outer join semantics means a
result set can return nulls. 😉
Some also do not understand the notion of a pre-join filter, during-join
filter and post-join filter.
( select … where … ) Q1 left outer join ( select …. where… ) Q2 on
Q1 left outer join Q2 on Q1.key = Q2.key and Q2.col = 1234
Q1 left outer join Q2 on Q1.key = Q2.key
Semantics of timing of the operations means that you get different results.
I think you can do this in Framework Manager by defining a query set (with the filter) as LEFT, defining a query set (with the filter) as RIGHT, and joining the two into a 3rd query subject.
if you want to see both ‘Residential’ and nulls…then change the filter to B= ‘Residential’ or null
LEFT OUTER join defined in Framework Manager is not included in
the Native SQL nor a splitted SELECT is generated to execute
This results in wrong data results!
Checking the Relationship SQL in Framework Manager does not
showproblem. Here the LEFT OUTER join is visible, but the result
does not reflect it!
Selecting one item from each table visualize the major
The Cognos SQL includes the LEFT OUTER join.
The Native SQL defines an INNER join within the WHERE clause
because <table1.itemA> = <table2.itemB>!
ReportStudio (and other studios will behave the same way but
notparticularyl tested) generates the same wrong SQL than
FrameworkManager and therefore returns wrong data!
This wrong SQL generation happens for Informix IDS 9.40
and 10. For Informix 11.10 also the native SQL includes the LEFT
We have to modify the cogdmif.ini, parameter Joined=F and
it to ?T? which is n o t the default in 8.4 for Informix IDS
9.40. Then the LEFT OUTER JOIN syntax is generated for Native
SQL. (see IR #615552 too)
Title: Informix and both Inner and Outer Join queries results
-> fixed in 8.4
->Workaround Description [Exceptions Joins] Inner=T
==>> this fix might cause the wrong SQL generation.
Modify the cogdmif.ini for IDS 9.40:
In section [Exceptions Tables Database:IDS INFORMIX9.40]
Joined=F -> change to Joined=T
By default Cognos is creating EXPLICIT outer join syntax. I want IMPLICIT outer join syntax. I can only control with the Governors (and in Reports Studio) the Explicit/Implicit setting for Inner Joins, not outer joins.
States: Regardless of the SQL Join Syntax governor setting, Explicit (inner/outer join) syntax will always be used for right/left/full outer joins.
And likely this is because there is no standard for Implicit outer join syntax in SQL, so looks like I’m the creek so to speak and stuck using multiple queries.
Informix as examples had various limitations. Most of those vendors for
several major releases implemented ISO joined table syntax which avoids the
sematic gaps/incorrect results that those other approaches came with.
While older Impromptu releases would try to use the vendors non-ISO forms
where possible it also moved to using the ISO forms where possible – such as
when Informix IDS put them in around 8/9 days.
So the question to be asked is, are you trying to understand how to get a
correct result set or trying to resolve a performance problem your RBMS
The FM governor you reference is only there because of some older SQL Server
releases a few other cases where some vendors managed to plan a less optimal
plan with a semantically equivalent INNNER join vs FROM-WHERE scenario. This
just allowed someone to try to either style should it turn out their vendor
has poor costing.
select from A, B, C, D where A.col=B.col and C.col=B.col and D.col=C.col
select from A inner join B on .. inner join B on ….
Optimizers in this case can figure out what is the optimal join sequence.
Once you start to use outer joins it means the RDBMS has to start to follow
the join order of the query to provide the required semantics.
If you have the following SQL
SELECT a.col1, b.col1,b.col2
From a left outer join b on a.col1 = b.col1
If the tables contained the data
You would get the following result:
If you now add a where clause to the SQL such as
Where b.col2 = ‘aa’
This will only return the result:
This is standard SQL if you want everything then you would have to explicitly ask
Where b.col2=’aa’ or b.col2 is null
The reason for this is that as soon as you select an item form column b n this case b.col2=’aa’ you are forcing that value to be selected, no matter how often you write the SQL you will only get the result of an equijoin despite having an outer join you could NEVER get any other result!
P Please consider the environment before printing this e-mail
the 6campbells : First, thank you very much for your explanation of the differences in 7 to 8+. I’m looking how to do this for two reasons. First is this is how our authors think. Now, I can train them not to think this way, and things would be a lot better. That will be the way I will have to go. Second is because that is how the data can be best returned. Performance is not the issue: it is returning the correct result set. The inclusion of “” or is null”” (or include missing values in Query Studio) will resolve the vast majority of our needs. The other ones where it won’t we will continue to use multiple queries in Report Studio to resolve.
Thanks everyone for helping with this – in the end it looks to be one of those design choices by IBM that I will live with (and understand why I have to).
occurs due to null values.
‘ABC’ + null = ‘ABC in most business users mind not null
123 * null = 0
count ( <exp> ) = count of all values not just count of non-null
<exp> <predicate> ( ‘abc’, ‘def’, null ) does not need to worry if the
preciate is true, false or unknown
Hence, in some reporting environments you can simulate the ‘intent’ of an
outer join with respect to master-detail patterns but once defined
in a model, etc. you are left with the problem of deciding if you coalesce all
value expressions to a non-null value where you have to seek agreement
as to what nulls transpose to. While they may suggest space=null for chars,
0=null for numerics and intervals you need to determine what business value
will be used
for dates and timestamps (i.e. a value which precedes the existence of the
business/legal entity) but then you have to review what makes sense for time
This is where OLAP expressions (aka MDX) differs from SQL and I’ll leave you
to read up on that if you are interested. 😉
The 6campbells solution is to define a Detail Filter with the “”OR NULL”” syntax.
But how does the report builder know that column comes from an OUTER JOINed table to start with? Especially in Query Studio where you can’t see the generated native SQL?
I would hate to teach every user to use the OR syntax unconditionally (even for INNER JOINed tables)!
I guess you’d say that the report builder has to run the report (without the filter) and recognize that some records have NULLs and then decide whether to eliminate them or not. But that can be really difficult if the OUTER JOINed table correlates highly to the driver table. It might be that very few records contain the NULLs. In fact, it may be the case that none contain NULLs in your test cases. But you have to code for what might happen…especially as new data flows into the system. I don’t believe a report builder could determine that the OR syntax is not needed simply because s/he sees no NULL values in some test results.
I first thought that the problem is that Cognos is assuming a post-join filter (love that clarification from 6campbells) which is problematic and is usually not what is intended. Might it be better to assume a pre-join or during-join filter (don’t they return the same results?). After all, the post-join approach eliminates records, making it very difficult for the report builder to understand what went wrong. (I speak from extremely recent experience!)
The pre-join approach would expose all potential records and it would be clearer to the report builder that a NOT NULL condition could be added to get rid of the unwanted records. But s/he wouldn’t be able to add such a condition! If all such conditions were implemented in pre-join filters, it would have no effect. To get rid of the unwanted records, s/he would have to be able to define a post-join filter.
So where does that leave us?
It seems Cognos ought to provide a toggle (PRE or POST) for each filter (much like the existing Application property (which only applies to Before or After Auto Aggregation).
Maybe Cognos should also indicate which query items are optional (coming from an OUTER JOINed source). Then the report builder would know how to handle them (whether to code the OR syntax or not … and whether to use the PRE or POST property setting).
I’ve used Cognos for nearly two years now and am shocked I didn’t know this before. It seems a gaping hole and one that I’m quite sure has led to the creation of many incorrect reports.
semantics of the predicate logic.
For example, consider:
( select from T1 where T1.C1 = 1 ) T1X inner join ( select from T2 where
T2.C2 = 10 ) T2Y on ( T1X.key = T2Y.key)
( select from T1) T1X inner join ( select from T2) T2Y on (T1X.key =
T2Y.key and T1X.C1 = 1 and T2Y.C2 = 10)
( select from T1) T1X inner join ( select from T2) T2Y on (T1X.key =
T2Y.key and (T1X.C1 = 1 or T2Y.C2 = 10))
The next is to consider how a business user knows if their business
question requires a pre, during or post join predicate. In many cases a
business user has no clue and more often than not null is not a concept
which enters into their vocabulary. A similar comment can be applied as to
if the business question is ‘customer who has bought at least one product’,
‘customers who have bought no products’ vs ‘customers who may have bought
none, one or more products’.
And for that matter, there are persons who may claim to know ORACLE,
Sybase, Informix or other vendor non-ISO joined table syntax but do not
know the semantic bugs in those vendors implementations as to the
application of predicates in the WHERE clause relative to the join
operation let alone combinations of nested/preserved table join patterns
that they may or may not support.
This is distinct from the suggestion from having a property on a filter
that denotes if the desired timing is pre, during or post join… for those
who know that there may be an important distinction.
builder is supposed be able to write a report without writing code.
It’s the responsibility of the metatdata modeller (Framework Manager) to
create the required metatdata and name it appropriately so that users of
the various studios do not fall prey to various reporting traps.
You say in your post … “”Cognos is assuming a post-join filter which is
problematic and is usually not what is intended”” and “”the post-join
approach eliminates records, making it very difficult for the report
builder to understand what went wrong.””
Both these statements suggest that your FM modeller has not fully
understood the business requirement, has not fully tested the model to
ensure accurate/consistent results, or possibly both.
Don’t the first two examples return the same data? The 3rd example is different as it switches from an AND to an OR (whether pre-join or during-join).
Well, unfortunately (or fortunately), we work with a 3rd-party Cognos model, so have no (well, maybe a little) influence on its design.
If we did design the model, it’s an interesting question. Can the trap be avoided? The original post from James is one in which the query subjects are related with an optional relationship and the report builder needs only a subset of the optional table to be considered for inclusion in the report (in his example, only Category = ‘RESIDENTIAL’).
The proffered solution is to add a condition of:
Category = ‘RESIDENTIAL’ or Category is null
This works for his case, but requires the report builder to know that the relationship was optional to start with.
If the modeler knows that RESIDENTIAL data is an important subset, he can model for that. But I don’t think the modeler can project every possible subset that might be used in a report. Even if s/he could, there might be an extremely large number of such possible subsets.
And I’m quite sure that we can’t rule out use of optional relationships.
So it doesn’t seem to me that we could model around this problem, even if we did have control of the model. But maybe I’m missing something. Do you see a way to change the model in this case that allows a user to choose a subset of data from the optional table without falling into the trap?
As @James pointed out, the COALESCE/OR approach only works if the filtered column is not nullable. If it is nullable, the query must be split, the filter applied to Table B, and the join coded manually…which can only be done in Report Studio.
Isn’t a Query Studio user completely out of luck here (given that s/he has no way to split the query into parts)?
And, again, how is such a QS user to know that they are out of luck?
Please note the following:
This will change the way Cognos writes your SQL.I have yet to do a full analysis using Life Cycle Manager of what percentage of our reports will be affected by this change, and what that affect will be.These changes are manual and will require you to do them again every time that you update your CognosAll cogdmif.ini files have to be the same, so find them all on every server your production Cognos uses (and back the old ones up!)They can be undone if they cause you problems.
1. Add the following to the [Exceptions Joins Database:IDS] section of your cogdmif.ini file(s)
2. Comment out the original values for those settings (place a semicolon in front of each line)
3. Restart your Cognos Services
I recommend that you do this on a test environment to check how it will impact your reports. Backing up the original .ini file (rename to cogdmif.ini.old) will allow reverting to be easy.
- Looker Raises $103 M Funding to Accelerate Product Innovation and Operations
- Immuta Adds New Features; Reduces Risk of Data Science Associated with Cloud
- Accenture Acquires Enaxis Consulting, Strengthens Its Data Science and Digital Capabilities
- Getting AP Checks to Print on a HP 9050 Printer
- The ethical harvesting, mining and use of transactional data
- Five Keys for Managing Big Data
- New Sales Tool Delivers Enhanced Lead Intelligence
- Getting Opening and Closing Balance On the Same Table