Analysis, Reporting, Visualisation with EXCEL

Each program has its advantages. SQL is very fast to answer questions on data. It is quite useful to pull fast insights, especially if you can directly write code to the database, i.e. in your ERM system. But it is not that good for reporting. When you need to do calculations on the results and present them in special formats, there are much stronger tools like Excel and Power BI.

Excel is among the most powerful tools to prepare your bespoke reports. One of its many uses is to perform specific calculations, report the results in desired outlay. If you need to report the percentage of the sales to customers from European countries to Total Sales, you can do this with one formula in a cell in Excel.

Last week we analysed the Lead Merchants sales database with SQL. SQL is efficient if you need standard reports like in the sales table. Each column has its unique content; each item of the same category can be under the corresponding column content in all rows.

Mostly we prepare reports like a Balance Sheet with the ageing breakdown, so it starts with the row ‘cash’ all amount is in ‘on-demand’ column, then the row ‘fixed assets’ for useful life broken down into years in columns. Then the row ‘receivables’ is broken down into ‘due’, ‘due in 2 weeks’, ‘1 month’, ‘3 months’, ‘1 year’ in the report columns. In such cases, the tool we want to use for reporting in Excel.

In this part of the workshop, we will first start with typical features of Excel and in the second part we will go on with the power features.

Analysis with Formulas and Pivot Tables

In classic Excel, there are a few different ways you can reach the desired outcome; using formulas, features like sort, filter, subtotals and pivot tables. We will do all in the first part, preparing an Excel report and a dashboard. My experience is that using only formulas is the best way to prepare and automate bespoke reports.

Excel is among the most powerful tools for bespoke reporting and analysis. But it comes with a few bottlenecks: there is a limit to the data you can load and analyse at one shot, which is around 1 million lines (1.048.576 rows to be exact), and it also has performance problems. Even if your data is much less than a million lines when you work on complicated reports such as the balance sheet ageing report, Excel gets slower and slower, at times it can be frustrating.

Lead Merchants database has 5 tables with orders table consisting of 2 million rows of data. There are still ways to work with that data in classic Excel but to prevent time loss and frustration we pulled summary data from SQL. It is the flat file we prepared in SQL, which means all tables of the database are combined in one big table, and data is summarised in a way that it has the least number of rows, which will allow us to do our analysis. For me, this was the best use of SQL for a long time, summarising the data to work in Excel; then came the power features.

The flat file we got from SQL has 229.560 rows of data + 1 header row. 2 million lines are summarised to around 1/7 of the original size. The summary could be done a few different ways; for our purpose, we think that monthly reporting figures would be enough for our analysis, reporting on daily figures was not necessary. We took out the day information, summarised the data for each month; meaning if a customer bought a product more than one time in a given month this data is decreased to one line, keeping all the column information the same but only aggregating the numeric figures such as total units sold. If the customer bought another product in the same month, this is another line.

If we needed the weekly figures, the summary data would be much bigger. If we needed the daily figures summarising over dates would not be possible.

Below are the summary data file and Excel workbook used in the analysis. For the complete database and related files follow the link.

Now let’s start our analysis. We will start with the Report, then the Pivot and finally the Dashboard.