ICAEW.com works better with JavaScript enabled.
Exclusive

Getting Internet data into Excel just got easier

Author: Simon Hurst

Published: 18 Dec 2023

Exclusive content
Access to our exclusive resources is for specific groups of students, users, subscribers and members.

Trying to link Power Query to data on a web page could be tedious at best. An enhancement to the ‘From Web’ data connector makes things a lot easier and uses AI to help you get the information that you need.

Introduction

Some years ago, Excel was able to extract information from a web page by identifying the tables on the page and allowing you to select those that you wanted to use. The new, Power Query based Get & Transform Data tools include a From Web command. Until the recent enhancement, this was a nightmare to try and use. If you were lucky you might come across a recognisable table, but:

Screenshot from an excel spreadsheet

However, for many web pages you were more likely to have to go through an extensive trial and error process of expanding column after column to try and find the information that you needed:

Screenshot from an excel spreadsheet

We looked at using the previous method a few years ago:

From Web in action

The new, enhanced ‘From Web’ tool does a much better job of identifying usable information within the target web page.

Here, we have used the Data Ribbon tab, Get & Transform Data group, From Web command to display the From Web dialog. If you just need a simple connection to a URL, you can use the Basic form of the dialog. For more complex requirements, choosing Advanced will display additional choices:

Screenshot from an excel spreadsheet

For our example, the Basic version is all we need, so we can just type in, or paste in, the required URL and press OK:

Screenshot from an excel spreadsheet

The dialog now shows a set of Suggested Tables from which you can choose the table, or tables, that best suit your requirements before either loading the table to a worksheet directly using the Load button or using the Transform Data button to open the table in the Power Query editor for further information processing.

There are some additional new features. You can link to the HTML Code of the web page or even extract all the ‘Displayed Text’. However, perhaps the most useful of the new features is the ‘Add Table Using Examples’ button. This works in a similar way to the Column From Examples feature in the Power Query Editor Add Column Ribbon tab. You can type in entries from a display of the web page or copy and paste them into the sample columns provided. Once enough data has been added for Power Query to work out a pattern, it will fill in the rest of the column for you:

Screenshot from an excel spreadsheet

Additional sample columns can be displayed by clicking on the column heading plus sign:

Screenshot from an excel spreadsheet

For our third column, we have started to type in some of the text from web page. Power Query will display an AutoComplete list of items that match what has been typed.

Once you have created the columns, or individual entries, that you want to work with, you can click the Ok button and a Custom Tables group will be created with your new table in it. Selecting the table will display a preview in the Table View window:

Screenshot from an excel spreadsheet

Here is the data from the web page in the Power Query Editor:

Screenshot from an excel spreadsheet

Incidentally, you will notice that Column2 apparently contains numbers, but Power Query has left the Data Type as text. Trying to change the Data Type to Decimal Number will work for the positive numbers but generates an error for the minus values. It took be a little while to work out what the problem was, but I eventually realised that the minus sign wasn’t a minus sign, but a different sort of dash character. Right-clicking on the top item and using Replace Values… to replace this character with a normal minus allowed Power Query to convert the column Data Type to Decimal Number.

The table can now be loaded into a worksheet where it can be refreshed manually, or set up to refresh periodically, to bring in the latest data from the web page.

Conclusion

For more articles on Power Query, search the ICAEW Excel Community article archive. Power_Query is a prominent term include in the Keywords word cloud.

Archive and Knowledge Base

This archive of Excel Community content from the ION platform will allow you to read the content of the articles but the functionality on the pages is limited. The ION search box, tags and navigation buttons on the archived pages will not work. Pages will load more slowly than a live website. You may be able to follow links to other articles but if this does not work, please return to the archive search. You can also search our Knowledge Base for access to all articles, new and archived, organised by topic.