Download a PDF of this article below:
Power Query is an incredible tool that can seamlessly extract data from a wide array of sources directly into Excel. With Power Query, the limitations of only working with data within the confines of one workbook are a thing of the past. You can now access data from sources as diverse as relational databases, Azure, and more.
However, I would venture to say that one of the most commonly encountered data sources is still the trusty old comma-delimited value (CSV) file. In this post, I aim to demonstrate how to establish a connection with such a file in a unique manner: when it is not residing on your computer’s local file system but is instead hosted on the web.
To get started, open a blank Excel workbook and navigate to the following: Data > Get Data > From File > From Text/CSV:
The next step is a bit unconventional.
You’re likely accustomed to the typical process of locating and selecting a file on your computer by clicking through directories. However, in this case, the file hasn’t been downloaded to your computer, and there’s no need to do so, especially if it’s hosted on a web page with a “.csv” extension.
All you need to do is copy the contents of a file, such as the one shown below, which contains abalone snail measurements, and paste it into the file explorer:
https://raw.githubusercontent.com/stringfestdata/training-assets/master/datasets/abalone.csv
From here, you can either transform or load the resulting dataset like any other in Power Query:
The dataset appears to be in good shape, so I’ll proceed to load it directly into an Excel table. Now, we have a Power Query-sourced data source that appears just like any other:
Please keep in mind that when you read data from a CSV file hosted online, you are essentially connecting to a live, external data feed. While this eliminates the need for downloading the file locally, it does introduce an element of unpredictability when it comes to data refresh if you’re not familiar with how this data source is maintained.
On the other hand, saving the file locally ensures reliable access and enhances security but requires manual updates and consumes storage space. The choice between these two methods ultimately depends on the trade-off between the need for up-to-date data and the considerations of data stability, security, and storage capacity.
To reassess the source of this query, right-click on the “abalone” query in the Queries & Connections menu and select Edit:
You can then click on the gearwheel icon located next to Data Source in the Applied Steps menu to view the source of this data:
Let’s say I’ve decided to store this file locally at the location C:/MyFiles/abalone.csv
. In this case, I can simply type that file path directly underneath the URL box in the menu:
Could you successfully follow the steps outlined here? Compare your result with mine below:
Do you have any questions regarding the process of reading files into Power Query, whether in CSV format or any other format? Please feel free to ask in the comments.
Leave a Reply