blog

7 Tips for a Successful SAS ETL Process

08/25/2015 by Maria Nicholson

As a kid, it was fun to take a pack of playing cards, lean one against the other and level by level see how high and fantastic a structure could be built before a slight nudge or breeze or one additional card disrupted the delicate balance and caused it to come crashing down.  Not exactly the image you want evoking with regards to your business intelligence implementation, right?

Unfortunately, this is what will come to mind if you place your end-user applications atop a shaky ETL foundation.  Below I’ve pulled together a few points to consider to help ensure sound construction and BI success.

Do not underestimate the importance of ETL

ETL isn’t sexy but a slick user interface is.  Therefore, it is tempting to be far more interested in how the BI reporting applications look and devote the majority of resources to this effort with the end result being panic because the content is flawed.

No matter how great an application’s user experience is, if the data are wrong the UI won’t matter.

Define clear and consistent business rules

Poorly defined and inconsistent business rules lead to confusion and reporting errors.  The time must be taken to consolidate rules and translate them into usable/reportable fields in the data model.

Eliminate Manual Processing

Manual processing introduces errors and time delays.  It often also means that the staff will need to work off-hours and weekends to complete these steps.

Value Process Over Fancy Tools

ETL process management and modeling tools such as SAS DI Studio and SAS Enterprise Guide can be really helpful in building your ETL solution.  Or you could build the whole ETL process using Base SAS.  No matter which software or packages you chose it will never eliminate the need for good design and planning.

Design a solid data model

You will need a good data model provides business intelligence, in particular, self-service BI, but I think they apply in general.

    • Use a simplified structure that business users have some hope of understanding
    • Know the structure that works seamlessly with the best self-service BI tools
    • Consistent attribute change management
    • Excellent query performance, which comes from several directions:
      • Fewer table joins because decodes and hierarchical structures are collapsed into a simple flat dimension table
      • Simple table joins because an efficient single-column surrogate key always joins facts and dimensions
      • Database engine optimizations that recognize dimensional structures
      • Where possible, pre-calculation of metrics or attributes during the ETL process rather than at query time

Implement data validation

Do not wait until data are loaded into production before reviewing the values.  That results in end-users or customers finding data errors, which could lead to loss of business.

Maintain Your Process

You must allocate adequate budget and appropriate staff to ETL build and maintenance.

It is important to realize that ETL is much harder and more resource-intensive than it may seem.  Having an experienced ETL design and implementation team and the correctly sized infrastructure in place to handle the processing load is money well spent.

These steps should help prevent your team from building a house of cards.