Chat with us, powered by LiveChat

See how Adaptiv can transform your business. Schedule a kickoff call today

Databricks' Serverless SQL: Build & Refresh a Calendar Dates Table

  • Technical

Introduction

In the world of Business Intelligence (BI), a well-structured calendar dates table is an invaluable asset. This fundamental component serves as the backbone for date-based analysis, enabling accurate reporting across various date dimensions. Whether you are tracking sales trends, analysing employee performance, or forecasting future metrics, a robust calendar dates table is essential for deriving meaningful insights from your data.

Databricks, with its SQL engine, offers an ideal environment for creating and maintaining such a crucial element. By leveraging Databricks’ capabilities, organisations can automate the process of generating and updating their calendar dates table.

In this blog post, we will explore how to harness Databricks’ serverless SQL and workflow (scheduling) functionalities to build and refresh a calendar dates table tailored for Business Intelligence applications.

We will cover the creation of the table itself, setting up a workflow. By the end, you will have a clear understanding of how to implement this solution.

Create the date table

Previously, we used to leverage a Python-driven notebook to build and refresh our data. We decided to now leverage SQL instead of Python as most of our customers are more at ease with the latter.

First things first, we need to define the start and end date of our table. In order to do so, we will leverage the SQL function sequence. Long story short, the sequence function is used to create an array of elements from start to stop and incrementing by a step. The below code will generate an array containing dates from January 1st 2020, to January 1st 2050.

Now the second step is to literally explode that array into proper rows. In order to do so, we will leverage the SQL function explode. Basically, this function will return a set of rows by un-nesting our array (collection).

In two very easy and simple steps we now have the foundation of our date dimension.

Next steps is to capitalise on those two steps and start creating various date related columns based on the value of the calendar_date column. To do so, we will be using a CTE (Common table Expression) definition to factorise and reuse our code in a neat manner.

Now comes the most interesting part which is defining and adding the Business Intelligence relevant date attributes. The below query enriches our table with additional attributes useful for BI reporting, including fiscal periods (assuming a fiscal year starting April 1).

Many of the built-in SQL functions available in Databricks will look very familiar to experienced data engineers who have worked with on-premises relational database management systems (RDBMS) like Oracle, SQL Server, or PostgreSQL. This similarity eases the transition for those moving from traditional data warehousing to Databricks.

For instance, in our calendar dimension example, functions such as year(), month(), dayofweek(), and weekofyear() are nearly identical in syntax and functionality to their counterparts in traditional RDBMS systems. This allows data engineers to leverage their existing SQL knowledge when building date-related dimensions and fact tables in Databricks.

with calendarDates as (

  select

    explode(array_dates) as calendar_date

  from

    (

      select

        sequence(

          make_date(2020, 01, 01),

          make_date(2050, 01, 01),

          interval 1 day

        ) as array_dates

    )

)

select

  10000 * year(calendar_date) + 100 * month(calendar_date) + day(calendar_date) as calendar_date_key,

  to_date(calendar_date) as calendar_date,

  year(calendar_date) calendar_year,

  month(calendar_date) calendar_month,

  day(calendar_date) calendar_day_of_month,

  date_format(calendar_date, 'EEEE MMMM dd yyyy') calendar_date_full,

  date_format(calendar_date, 'EEEE') calendar_day_name,

  date_add(to_date(calendar_date), -1) as calendar_date_previous_day,

  date_add(to_date(calendar_date), 1) as calendar_date_next_day,

  case

    when date_add(calendar_date, (weekday(calendar_date) + 1) -1) = calendar_date then to_date(calendar_date)

    else date_add(calendar_date, -(weekday(calendar_date)))

  end as calendar_week_start,

  date_add(

    case

      when date_add(calendar_date, (weekday(calendar_date) + 1) -1) = calendar_date then to_date(calendar_date)

      else date_add(calendar_date, -(weekday(calendar_date)))

    end,

    6

  ) as calendar_week_end,

  weekday(calendar_date) + 1 as calendar_week_day,

  weekofyear(calendar_date) calendar_week_of_year,

  date_format(calendar_date, 'MMMM yyyy') calendar_month_year,

  date_format(calendar_date, 'MMMM') calendar_month_name,

  date_add(last_day(add_months(calendar_date, -1)), 1) calendar_first_day_of_month,

  last_day(calendar_date) calendar_last_day_of_month,

  case

    when month(calendar_date) in (1, 2, 3) then 1

    when month(calendar_date) in (4, 5, 6) then 2

    when month(calendar_date) in (7, 8, 9) then 3

    else 4

  end AS fiscal_quarter,

  year(date_add(calendar_date, 89)) AS fiscal_year,

  case

    when to_date(now()) = calendar_date then true

    else false

  end as current_day,

  CASE

    WHEN to_date(now()) BETWEEN (

      case

        when date_add(calendar_date, (weekday(calendar_date) + 1) -1) = calendar_date then to_date(calendar_date)

        else date_add(calendar_date, -(weekday(calendar_date)))

      end

    )

    AND (

      date_add(

        case

          when date_add(calendar_date, (weekday(calendar_date) + 1) -1) = calendar_date then to_date(calendar_date)

          else date_add(calendar_date, -(weekday(calendar_date)))

        end,

        6

      )

    )

    THEN true

    else false

  end as current_week,

  case

    when month(to_date(now())) = month(calendar_date)

    and year(to_date(now())) = year(calendar_date) then true

    else false

  end as current_month,

  case

    when year(to_date(now())) = year(calendar_date) then true

    else false

  end as current_year

from

  calendarDates

limit

  10;

Create our notebook

Now that we defined our SQL, we need to encapsulate it inside a Create Table statement, then we will parametrise our notebook to pass 2 parameters (widgets) defining which catalog and schema to use.

We will put that in motion directly in a Notebook as per the below:

The full notebook can be accessed here:

-- first let's set the timezone to the correct location. UTC is behind NZST or NZDT.
SET timezone = Pacific/Auckland;

-- let's use the right catalog and schema passed as parameters
use catalog ${catalog_name};
use schema ${schema_name};

-- create a managed table in our Unity Catalog
create or replace table calendar as
-- CTE to simplify our SQL
with calendarDates as (
select
explode(array_dates) as calendar_date
from
(
select
sequence(
make_date(2020, 01, 01), -- start date
make_date(2050, 01, 01), -- end date
interval 1 day -- incremental step
) as array_dates
)
)
-- the SQL transforming our main calendar_date into all the colunmns we will be requiring
select
10000 * year(calendar_date) + 100 * month(calendar_date) + day(calendar_date) as calendar_date_key, -- calendar table primary key
to_date(calendar_date) as calendar_date,
year(calendar_date) calendar_year,
month(calendar_date) calendar_month,
day(calendar_date) calendar_day_of_month,
date_format(calendar_date, 'EEEE MMMM dd yyyy') calendar_date_full,
date_format(calendar_date, 'EEEE') calendar_day_name,
date_add(to_date(calendar_date), -1) as calendar_date_previous_day,
date_add(to_date(calendar_date), 1) as calendar_date_next_day,
case
when date_add(calendar_date, (weekday(calendar_date) + 1) -1) = calendar_date then to_date(calendar_date)
else date_add(calendar_date, -(weekday(calendar_date)))
end as calendar_week_start, -- start of week date
date_add(
case
when date_add(calendar_date, (weekday(calendar_date) + 1) -1) = calendar_date then to_date(calendar_date)
else date_add(calendar_date, -(weekday(calendar_date)))
end,
6
) as calendar_week_end, -- end of week date
weekday(calendar_date) + 1 as calendar_week_day,
weekofyear(calendar_date) calendar_week_of_year,
date_format(calendar_date, 'MMMM yyyy') calendar_month_year,
date_format(calendar_date, 'MMMM') calendar_month_name,
date_add(last_day(add_months(calendar_date, -1)), 1) calendar_first_day_of_month,
last_day(calendar_date) calendar_last_day_of_month,
case
when month(calendar_date) in (1, 2, 3) then 1
when month(calendar_date) in (4, 5, 6) then 2
when month(calendar_date) in (7, 8, 9) then 3
else 4
end AS fiscal_quarter,
year(date_add(calendar_date, 89)) AS fiscal_year,
case
when to_date(now()) = calendar_date then true
else false
end as current_day, -- true if calendar day is today
CASE
WHEN to_date(now()) BETWEEN (
case
when date_add(calendar_date, (weekday(calendar_date) + 1) -1) = calendar_date then to_date(calendar_date)
else date_add(calendar_date, -(weekday(calendar_date)))
end
)
AND (
date_add(
case
when date_add(calendar_date, (weekday(calendar_date) + 1) -1) = calendar_date then to_date(calendar_date)
else date_add(calendar_date, -(weekday(calendar_date)))
end,
6
)
) THEN true
else false
end as current_week, -- true if calendar day is in current week
case
when month(to_date(now())) = month(calendar_date)
and year(to_date(now())) = year(calendar_date) then true
else false
end as current_month, -- true if calendar day is in current month
case
when year(to_date(now())) = year(calendar_date) then true
else false
end as current_year -- true is calendar days is in current year
from
calendarDates
-- modify our calendar table in order to set a PK (primary key)
ALTER TABLE calendar
ALTER COLUMN calendar_date_key SET NOT NULL;
alter table calendar add primary key (calendar_date_key);

Next steps:

  • Modify the notebook to add the columns that you need for your use case.
  • Test and test.
  • Create a workflow to schedule and run the notebook.
  • Create Dimension views of your calendar and start using it!

Ready to elevate your data transit security and enjoy peace of mind?

Click here to schedule a free, no-obligation consultation with our Adaptiv experts. Let us guide you through a tailored solution that's just right for your unique needs.

Your journey to robust, reliable, and rapid application security begins now!

Talk To Us