Celebrating 25 Years with the SESUG Conference
11/16/2017 by Ben Zenick Modernization - Analytics
Many SAS users attend SAS Global Forum but never consider the value that regional conferences provide. The Zencos Data Science team attended the 25th SouthEast SAS Users Group (SESUG) conference in Cary, NC. The event was held on the beautiful SAS campus on the first of November. On opening night (and again on Monday) we enjoyed the birthday cake.
Our team presented topics ranging from using SAS Viya with Python to a hands-on workshop showing how to use PROC DS2. Over the two-day period, multiple papers were presented, and we could not attend everything we wanted to. Below are the presentations we found most interesting or that had an unexpected takeaway. All of the conference papers are available on Lex Jansen's site.
Rebecca Hayes did a good job explaining how the pieces of the SAS environment contributed to the intermittent errors. I was unaware that, new in 9.4, Grid-launched Workspace Servers default to an ephemeral port range rather than the static port used previously.
I have some experience with ephemeral ports, so I understood the challenges they can cause when a port range must be kept unblocked by firewalls. I found her description of troubleshooting and resolving a frustrating error very relatable. Much of the effort in troubleshooting goes into accurately characterizing an error, and when errors are intermittent and difficult to reproduce consistently, it can drive you to pull your hair out.
Mike Jadoo, an economist at the Bureau of Labor Statistics, described what an Application Programming Interface (API) is and how APIs are useful in the data production process. It was interesting to see how he uses SAS software to interface with APIs to acquire data for his analyses at the Bureau.
A task that used to require downloading physical files and then loading the information into SAS can now be accomplished with API calls made directly from SAS, which makes data processing and analysis more efficient. As more companies and government agencies begin using APIs to distribute information, techniques like these will become increasingly useful for acquiring data for analysis. As Mr. Jadoo pointed out in his presentation, SAS software is the perfect tool because not only can you use SAS to get the data, you can then process and analyze it all in one place!
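As a rough sketch of the idea, PROC HTTP can request data from a web API and the JSON libname engine (SAS 9.4M4+) can read the response. The BLS series ID below is only an illustration, not one Mr. Jadoo necessarily used:

```sas
/* Request BLS time-series data directly from SAS with PROC HTTP.
   CUUR0000SA0 (CPI, All Urban Consumers) is an illustrative series ID. */
filename resp temp;

proc http
   url="https://api.bls.gov/publicAPI/v2/timeseries/data/CUUR0000SA0"
   method="GET"
   out=resp;
run;

/* Read the JSON response with the JSON libname engine */
libname bls json fileref=resp;

/* List the data sets the engine built from the response,
   then work with whichever one holds the observations */
proc datasets lib=bls;
quit;
```

From there the data is an ordinary SAS data set, ready for the usual processing and analysis steps.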
This paper encouraged SAS developers to continue doing the same quality work we were doing, but better and more efficiently. The lessons learned in the presentation, such as using the IFC and IFN functions instead of traditional IF-THEN/ELSE logic, are already being leveraged to accomplish tasks more efficiently. The point of attending these conferences is to improve your skill set and knowledge base, so even little tips like making better use of the functions available help. I love that taking a step back to dig deeper into Base SAS programming is reaping benefits across a variety of my projects.
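To illustrate the IFC/IFN tip: both functions evaluate a condition and return a value in a single assignment (IFC for character results, IFN for numeric), replacing a multi-line IF-THEN/ELSE. The input data set and variables here are made up for illustration:

```sas
/* Hypothetical example: flag overtime in one statement per variable */
data work.pay;
   set work.timesheets;                              /* made-up input */
   bonus  = ifn(hours > 40, 500, 0);                 /* numeric result  */
   status = ifc(hours > 40, 'Overtime', 'Regular');  /* character result */
run;
```

Each assignment is equivalent to a full `if hours > 40 then ...; else ...;` block, just more compact.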
This presentation (one of many on the topic) stood out because it demonstrated a wide variety of text mining techniques available to data scientists. Clustering, topic extraction, and sentiment analysis were all performed on Twitter data gathered to measure public sentiment and common reactions to the 2016 demonetization in India. While the results weren't groundbreaking, the process was very informative and applicable to other business use cases such as customer and employee sentiment analysis and product research.
Troy explained that the SAS log does more than tell you whether your code has errors; it also shows the performance timing of your job. He provided a macro that helps analyze your job's performance.
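One simple way to surface richer timing data in the log (the kind of detail a macro like Troy's can then parse and summarize) is the FULLSTIMER system option, which reports real time, CPU time, memory, and more for every step. A minimal sketch:

```sas
/* FULLSTIMER expands the step notes in the log from just real/CPU
   time to detailed resource statistics (memory, OS-level timings). */
options fullstimer;

data work.cars_copy;
   set sashelp.cars;   /* SASHELP.CARS ships with SAS */
run;
```

After this runs, the log's note for the DATA step includes the extended statistics rather than the default two timing lines.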
I was interested to hear Troy speak after purchasing his new book, SAS Data Analytic Development: Dimensions of Software Quality. His talk gave me ideas for building reports in SAS Visual Analytics that would let customers see their batch process statistics over time. I liked the concept of comparing the current day's job against the average of past runs; when you start having server issues or data volumes grow, this helps you see the impact over time.