This article compares data analysis using Web Intelligence to the same workflow using Microsoft Excel and for the given scenario demonstrates why using Web Intelligence is quicker, easier and less error prone.
The objective here is not to show that in all cases Web Intelligence is better than Excel rather to highlight that there are some situations where it is preferable to use Web Intelligence. Equally there are some situations where Excel is better than Web Intelligence, for example, calculating the geometric mean is easier in Excel as we can use a built in formula whereas in Web Intelligence we have to calculate the geometric mean manually.
First let us consider he following typical business scenarios,
- The head of account management is looking at reassigning clients to account managers in her team. To do so she first wants to get a list of accounts that Bob has dealt with that Sarah has also dealt with.
- A marketing analyst would like to get a list of products that are no longer in the top 100 this month that were top 100 last month.
- Also from marketing analysis we want to review customer segmentation. We want to know how many of our customers are in the age group 20 to 25 and how many live in a large city and how many own a car. We also want to know how many fall into each of these three segments.
Each of these scenarios are simple enough to understand but doing the performing the analysis can get quite complex as in each case we are having to perform set analysis. Set analysis is all about identifying one group of customers, products, clients etc and then seeing how this set compares to another set.
In the first scenario above the two sets are Sarah’s clients and Bob’s clients and we want to see what clients in one set belong to another. In the second scenario one set is the top one hundred products this month and the second set is the top one hundred products last month. And in the last scenario we have three sets: customers aged 20 to 25, customers who live in a large city and customers who own a car.
Set operations are often represented as Venn diagrams where the two circles represent each set and the different areas represent the different relationships. The first scenario is displayed as a Venn diagram in figure 1 below.
and the third scenario above is represented by this Venn diagram,
Almost all leading databases support set operations and we can use Web Intelligence to make use of these to perform our set analysis. Microsoft Excel however is not particularly good at set operation, although it can be done with a bit of hacking around.
Let us look at how Web Intelligence compares to Excel for this type of analysis.
Set Analysis using Excel
Let us look at how we might perform the first scenario in a spreadsheet. Consider a spreadsheet where we have two columns: a list of Sarah’s Clients and a list of Bob’s Clients and we want to produce a list of clients that belong to both Bob and Sarah.
We can of course just visually inspect the two lists and can spot which clients are common to both for example “Initech” and “Roboto Industries” are in both lists. We could then just write down the clients common to both. This is fine for a small list such as above but not so practical if the lists contains hundreds or thousands of entries.
For longer lists we can make use of the vlookup function.
Set Analysis in Excel using the vlookup function
Beside the two columns of data we add a third column and enter the formula,
and copy this down over all rows. This will give a result similar to the screenshot below
This is effectively taking a value from Bob’s list (range C3 to C21) and then looking up this value in Sarah’s list (range B3 to B21). When it finds a match it returns the value otherwise it returns an “#N/A” error.
So we have now identified a list of clients common to both lists however all these “#N/A” statements are a bit ugly but we can hide these by updating our formula to,
We now have a list of clients common to both Bob and Sarah but we still have blank rows for when there is a client in common. To remove these we can use the filter function to hid the blank rows. However this also affects the original lists so not ideal but we can copy values of the final list to another spreadsheet.
As we can see it is not impossible to identify which clients are common to both Bob and Sarah using Excel but it isn’t particularly straightforward. Furthermore if we have large data sets or we are trying to analyse three or more sets of data then this method becomes quite time consuming and is prone to copy/paste mistakes and other user errors.
Let us now look how we can perform the same analysis using Web Intelligence
Set Analysis using Web Intelligence
We have a universe that we can use to get a list of clients that an account manager has had contact with so first we’ll create a query that lists Sarah’s clients.
And this produces a simple table listing Sarah’s clients. We now edit the query and click the set analysis icon to create a second query, here we update the query filter to now list Bob’s clients. We also change the set operator from “union” to “intersection” and run the query. For more information on this feature see “Using combined queries” in the Web Intelligence user guide “Building Reports with the SAP BusinessObjects Web Intelligence Java Report Panel”.
This then produces a list of clients that both Sarah and Bob have had contact with,
As can been seen this is much quicker and easier than hacking the data in Excel. In addition we can easily change the set operator to the minus set operator to determine which clients Sarah has had contact with that Bob has not.
Is this the only way of doing set analysis in Web Intelligence?
Well it is certainly the easiest, the other option is to bring both sets of data into the report and then analyse the data there however Web Intelligence doesn’t provide any set analysis features or formula so it is a little tricky but it is still possible.
Report side set analysis
Here we create two separate queries and then integrate the result in the report. This could be useful if we want to display a list of Sarah’s clients, a list of Bob’s clients and a list of clients common to both. Let us look at how we might do this.
We create a report that contains two queries. Each query is the same as shown in figure 5 above where in the Result Objects we have client name and in the Query Filters we have “Employee Name equals Sarah” in one and “Employee Name equals Bob” in the other query.
When we run this report Web Intelligence will by default enable the merge dimension feature to integrate the results of the two queries and the screenshot below shows the result. For more information on merged dimensions see the chapter “Merging dimensions from multiple data providers” in the Web Intelligence user guide.
The report displayed above in figure 8 is the result from this query and here we are displaying three columns. The first displays the result set of the query returning Sarah’s Clients, the other Bob’s clients and in the middle we have the result of the merged dimension.
When merging these two queries Web Intelligence is performing a union of data from both queries (in SQL language a full outer join). What we want however is an intersection of these two data sets (or normal join in SQL).
Unfortunately there isn’t any feature where we can control how Web Intelligence merges the result sets, for example instructing Web Intelligence to use an intersection or indeed a minus set operation rather than a union. However there is a workaround.
We add a new column to the middle table and enter the following formula,
=Count([SarahsClients].[Client Name]; All)
This returns 1 when the client belongs to the set of Sarah’s clients otherwise it returns zero. We then update this to,
=Count([SarahsClients].[Client Name]; All) + Count([BobsClients].[Client Name]; All)
And this returns a 2 when the client exists in both Sarah’s and Bob’s client list.
Finally we create a new variable of this formula which we can then use to filter the table to only display the clients common to both Sarah and Bob.
Above has shown that both Excel and Web Intelligence can perform Set Analysis. In Excel it requires a reasonable amount of manual intervention and the use of the vlookup function to establish a list of members common to both sets. This manual effort is time consuming and is prone to user errors.
In Web Intelligence the easiest method is to use the set analysis feature of the query to return our desired set. We also saw that it is possible to perform the same analysis in the report through use of a relatively simple formula although it would be better if we could specify that the merge dimension feature should use an intersection or minus when merging two data providers.
So it is reasonable to conclude that set analysis is best performed using Web Intelligence.
This article isn’t trying to show that in all cases you must use Web Intelligence instead of Excel but is aiming to highlight that as an analyst you should know both products and be able to decide which tool is best for a given scenario.