Citi Bike Demo

Yesterday I attended a free workshop put on by Snowflake. The session, titled “Zero to Snowflake in 90 Minutes,” covered Snowflake’s architecture, performance and scalability and included a hands-on demo. Snowflake touts itself as “The Data Warehouse Built for the Cloud” and is gaining enterprise customers at a dizzying pace.

The demo used data from Citi Bike, New York City’s bike share system. Citi Bike is the nation’s largest bike sharing service. The data can be downloaded from: https://www.citibikenyc.com/system-data

The workshop provides an introduction to setting up and using Snowflake. The outline is below, and the lab takes about 90 minutes:

Lab Overview
Module 1: Prepare Your Lab Environment
Module 2: The Snowflake User Interface & Lab “Story”
Module 3: Preparing to Load Data
Module 4: Loading Data
Module 5: Analytical Queries, Results Cache, Cloning
Module 6: Working With Semi-Structured Data, Views, JOIN
Module 7: Using Time Travel
Module 8: Roles Based Access Controls and Account Admin
Module 9: Data Sharing

I found the workshop very interesting for two reasons. First, it covered all the basics of using a cloud-based database. Users loaded data from an S3 bucket, parsing both CSV and JSON files, then queried the database and managed schemas and security. The second reason I enjoyed the session is that Qlik’s Elif Tutuk used this dataset for a Qlik Sense demo app.
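The load step boils down to two Snowflake statements: create a stage that points at the S3 bucket, then copy the staged files into a table. Below is a minimal Python sketch that builds that SQL text. It is not the workshop's actual script. The stage name, bucket URL and table name are placeholders I made up for illustration, and a private bucket would also need credentials.

```python
# Hypothetical names throughout: the stage, bucket URL, and table used in the
# example call are placeholders, not the ones from the workshop.

SUPPORTED_TYPES = ("CSV", "JSON")


def build_load_statements(stage, bucket_url, table, file_type):
    """Return the (CREATE STAGE, COPY INTO) SQL text for one S3 load."""
    file_type = file_type.upper()
    if file_type not in SUPPORTED_TYPES:
        raise ValueError(f"unsupported file type: {file_type}")
    # The stage is a named pointer to the external S3 location.
    create_stage = f"CREATE OR REPLACE STAGE {stage} URL = '{bucket_url}';"
    # COPY INTO reads every file under the stage and loads it into the table.
    copy_into = (
        f"COPY INTO {table} FROM @{stage} "
        f"FILE_FORMAT = (TYPE = {file_type});"
    )
    return create_stage, copy_into


if __name__ == "__main__":
    statements = build_load_statements(
        "citibike_trips", "s3://example-bucket/citibike-trips/", "trips", "csv"
    )
    for sql in statements:
        print(sql)
```

The same helper covers the JSON files by passing "json" as the file type. In that case the target table would typically hold the raw documents in a single VARIANT column.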

I found a copy of the old Qlik Demo app and set it up on a Qlik Sense instance.

I created an ODBC connection (using a DSN) and was able to update the data from Snowflake. The combination of Qlik Sense and Snowflake is compelling. I liked the Snowflake demo, especially when I could match it up with the visualizations from Qlik Sense.
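For anyone connecting from a script rather than from Qlik Sense, a DSN-based ODBC connection string looks roughly like the one built below. This is only a sketch: the DSN name, user and password are hypothetical, and the DSN itself still has to be defined in the operating system's ODBC manager with the Snowflake driver.

```python
def _quote(value):
    """Wrap a value in braces if it contains characters ODBC treats specially."""
    if any(ch in value for ch in ";{}"):
        # Inside a braced value, a closing brace is escaped by doubling it.
        return "{" + value.replace("}", "}}") + "}"
    return value


def build_dsn_connection_string(dsn, user, password):
    """Build a 'DSN=...;UID=...;PWD=...' ODBC connection string."""
    parts = [("DSN", dsn), ("UID", user), ("PWD", password)]
    return ";".join(f"{key}={_quote(val)}" for key, val in parts)


if __name__ == "__main__":
    # Hypothetical DSN and credentials, for illustration only.
    print(build_dsn_connection_string("snowflake_demo", "analyst", "secret"))
```

A string in this form is what an ODBC client library such as pyodbc accepts in pyodbc.connect.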

Credit Scoring Model

From late 2018 into 2019, I worked on a credit scoring model. As part of the work I wrote a whitepaper outlining the process, methodology and results.

A copy of the whitepaper is available for download: https://www.ericfrayer.com/wp-content/uploads/2019/10/Credit_Scoring_Whitepaper_v1.pdf

As part of the project I used R and SPSS for model construction and verification. The output from the model is available here:
https://www.thinkdataanalysis.com/DataAnalysis_R.html

Microsoft Analysis Services

In 2012, I picked up a copy of Teo Lachev’s “Applied Analysis Services.” The book showed how Microsoft was pulling together Excel, Power View, Power Pivot, Tabular Modeling and the new DAX language (Data Analysis eXpressions). The traditional OLAP SSAS MDX cube wasn’t going away, but the new hardware options and increased need for self-service meant a new technology was required. DAX uses standard Excel formula syntax. This gave business users a way to extend Excel logic, formulas and calculations. The power of Excel with the promise of self-service BI is pretty compelling.

During my time at Qlik (2012-2017), Microsoft continued to build and expand its products. With the Tabular model, Microsoft adopted a “columnstore indexing” strategy using VertiPaq. This allowed for much more data to be available on disk and in memory.

For more info visit: https://docs.microsoft.com/en-us/analysis-services/

Google Analytics

This is just a quick post to share that I’m using this site mainly as a “technical sandbox”: someplace to try out different functionality and post working examples. I’m using Google Analytics in a browser and in an app on my phone to see if I’m getting any traffic. Most of my “users” are friends, colleagues and potential employers to whom I’ve given the URL.

Anyway, here is a page from Google Analytics for my site. I guess it shouldn’t be surprising how much Google provides for developers and internet users.

Learning Online

It’s amazing how many free or low-cost learning solutions exist online. Currently, I’m using DataCamp, Udemy, and Codecademy. I’m also following specific experts including Nathan Yau, Hadley Wickham and Graham Williams.

New format coming!

Thanks for stopping by EricFrayer.com. Over the years, I’ve used wikis, blogs and other content sharing tools to post thoughts, tips and reference materials. Most of this was internal to the companies I’ve worked for, on SharePoint intranets, Confluence pages or other web-based knowledge management or content sharing sites.

At this point, it makes sense to post “samples of work” to build out my professional online resume. I’m using AWS and Azure to host this content. My interest is in finding insights and making data actionable: not just repeating buzzwords, but actually demonstrating how all the pieces come together to give the end user meaningful analytics.

I’ve been thinking about this for 10 plus years. Now it’s time to share!