SOG Consistency Sheet
In this series, we will be building a Google Sheet that fetches NHL players’ shot data, which updates based on the player the user selects. If you don’t care about this specific metric or even hockey in general, don’t worry – my goal is that you take away some key concepts so you can create your dream cheat sheet.
I’ve built a template which we will use as a starting point. You can get a copy of the template here, but be sure to use it in Google Sheets, as downloading and using this sheet using any other spreadsheet application will not work. To get started, Once you’re in the sheet, you’ll see three tabs; Dashboard, Data, and Index. We’ll get into more detail soon, but here is a quick overview of the three tabs:
Dashboard
This is the main tab, and for the most part, the only tab we’ll be using once we’re done building the sheet. The Dashboard is simple and features a player selection tool, and the SOG Consistency % panel, which will tell us how often a player has X amount of shots.
Data
The data tab is where we will temporarily import the selected player’s 22-23 game log, and where we will extract the information that will ultimately display on our Dashboard tab. A great data source is what makes this all possible, so I would like to give a huge shoutout to the fine folks at QuantHockey, the site that we’re using for this tutorial. I highly recommend checking out the site, where there are easy-to-use tools to slice and dice stats in every way imaginable. You can also find them on Twitter @QuantHockey.
Index
The index is what powers the player selection feature. The index is just a long list of player names, along with a few other pieces of data, with the most important being the players’ Quant Hockey ID. Once a player’s name is inputted on the Dashboard, it will search the index to see if the player exists. If they do, we’ll be able to easily get any other piece of information about that player from the index, which in our case, is going to be a link that Google Sheets can use to fetch data on demand. Building out these indexes can take a while, and is painstakingly boring, which is why I’ve included an index of over 500 players to help get you started.
Setting Up the Dashboard
The first thing we’re going to do is set up our select player feature. Ensure cell D2 is selected, and then click Data > Data Validation, and choose Add Rule. Under Criteria, Select Dropdown (from a range). Copy and paste “Index!B3:B603” into the first field, and click the green “Done” button. This means our index can hold up to 600 players, as it’s looking from row 3 to 603. If you want to add more players, just increase the second row’s number. If you want your chart to look exactly like the screenshots, click Data > Data Validation once again and scroll down to Advanced options, and under Display style, select Arrow. Before moving on to the next step, test your drop-down menu by typing and choosing a player of your choice.
Setting Up the Data Tab
In this step, we’ll work on the tab that makes this all work, and learn about two unique Google Sheet formulas.
First, we’ll want to pull the selected player’s name onto our Data sheet. While on the Data sheet, select cell D2, and type “=” in the formula bar. From here, select the Dashboard tab, click on cell D2 and press enter. We’ll now have easy access to whichever player is selected on the Dashboard tab.
Second, we’re going to get the team the player is on, which we’ll need later. To do this, we’re going to use the VLOOKUP function to get this piece of information from our data tab. Select cell D3 and copy and paste the following into the formula bar and press enter:
=VLOOKUP(D2, Index!B3:Index!G603, 2, FALSE)
Next, we’ll need to get the Quant Hockey ID for our selected player. Select cell D4 on the Data sheet, and paste the following into the formula bar:
=VLOOKUP(D2, Index!B3:Index!G603, 3, FALSE)
This formula looks for the selected player in our index and returns the 4th column of information, which in our case is that player’s Quant Hockey ID. Our Data tab should be looking something like this:
We now need to turn the Quant Hockey ID into a usable link. To accomplish this, we’re going to use a formula which will allow us to combine multiple pieces of text together. On the Data Tab, select cell D4 (where the Quant ID currently is), and in the formula bar, copy and paste:
=CONCATENATE(“http://www.quanthockey.com/game-logs/en/game-log.php?player=”, VLOOKUP(D2, Index!B3:Index!G603, 3), “&season=2022-23&st=r”)
NOTE: If you receive an #ERROR! after entering in the formula, don’t worry – it’s likely an issue with the quotation marks in the formula not copying correctly. To fix this issue, you can try one of two things. First, copy the formula again, and to paste it in the formula bar, press Ctrl + Shift + V, which will paste the formula as plain text. If this doesn’t solve the problem, manually replace the quotation marks in the formula by hand, which should solve this issue. Keep this in mind moving forward, because if you experienced an error with this formula, it may happen again.
There might seem like a lot going on here, but it’s actually pretty simple. On the Quant Hockey site, every player’s game log can be accessed with the same link, with the only difference from player to player being the Player ID. What we’re essentially doing is taking that link and making it on the fly using the CONCATENATE formula, by inserting our player’s ID into a Quant Hockey link. One benefit to doing it this way is that come next season, all you’d need to change is 2022-23 to 2023-24 in the CONCATENATE formula and you’ll have brand new data.
The link that we’ve just created will update for whichever player you select on the Dashboard, and now that we have that piece of information, we can use one of Google Sheet’s most powerful formulas: IMPORTHTML. While still on the Data tab, select cell B7, and copy and paste the following in the formula bar:
=IMPORTHTML(D4, “table”,”1″)
You should see the sheet update with the game log of whichever player is selected on the Dashboard tab. IMPORTHTML imports the contents of a specific section of a website, and in our case, it’s going to the Quant Hockey link of our selected player and pulling in the 1st table on the page.
Right now, the query is pulling in more information than we need, so we’re going to use the QUERY function itself to restrict which information is imported. Once again, select cell B7, delete the formula that is there now and copy and paste this updated formula:
=QUERY(IMPORTHTML(D4, “table”,”1″), “SELECT Col1, Col2, Col3, Col4, Col5, Col6, Col7, Col39 Where Col1>0 ORDER BY Col1 Asc”, 0)
In this formula, our QUERY formula is doing 3 things. First, it selects only the columns that correspond to the stats we are looking to import (Game #, Date, Game, Score, G, A, P and Shots), selects only the rows that have a game number (which prevents us from importing things like header rows that we don’t need) and sorts it so that the most recent game is at the top of the list, which will be helpful later.
TIP: You may be wondering why we’re limiting the number of columns with the QUERY function at all. First, I wanted to make it a bit more manageable for tutorial purposes by only pulling in the data we needed. Second, as you start to do more and more with Google Sheets, limiting the total number of IMPORTHMTL functions used, and the amount of data they pull in with QUERY, will improve the performance of your sheets.
Lastly, we’re going to add a column to easily tell us whether it was a Home or Away game. In cell J7, copy and paste the following into the formula bar:
=IF(B7>0, IF(LEFT(D7,3)=$D$3, “AWAY”, “HOME”), “”)
This formula looks at the game column (column D) and determines if the player’s team is on the left side of the @ symbol. If it is, it’s an Away game, and if not, it’s a Home game. Once again, select cell J7, and click and drag the small blue square at the bottom right of the cell down to cell J90, which ensures all rows will have this Home/Away data.
At this point, we should now be able to import the game log of any player from our index. In part two, we’ll start populating our dashboard chart with the data we now have access to. Before we wrap up, let’s take a quick look at how to add players to the Index.
Adding New Players to the Index
While on the Index tab, scroll to the bottom of the sheet. To add a player, you’ll need three pieces of information: the player’s name, the three-letter acronym for their team (if you’re not sure of the code, check the Index for another player from the team you’re looking for) and their Quant Hockey ID. Visit QuantHockey.com, and search for the player of your choice. Once you’re on that player’s page, the ID will be the 4-5 numbers after the “=” sign at the end of the URL.
Thanks for reading, and be sure to check out Part 2! If you have any questions, or suggestions on how to improve what we’ve built so far, drop them in the comments below.
[…] for coming back for Part 2 of Automating Sports Data with Google Sheets. In Part 1, we went over our Google Sheet template and what each tab was for, built our player selection tool […]
The CONCANTENATE function doesn’t seem to work with new players added to the index and I have no idea why.
Hey, do you have any idea how to improve the google sheet to include players who change teams during the season. What I mean is the correct indication for home and away matches. You didn’t include such changes in your blog article.
It’s certainly possible, but that is more complicated and outside the scope of what the goal of this article is for, which is simply to highlight the basics of data automation with Google Sheets. To do what you want, you’ll need to update the index to include the player’s new team, and a new column for when the trade occurred. On the data sheet, add a new column for “player team” and update the Home/Away column to reference the player team column. From there you’ll be able to use formulas to decipher which team the player was on for each game. Good luck!
Thanks man. I finally managed to create such a database. Now I have another challenge, I would like to include shot attempts in my statistics, but unfortunately I cannot export them on the quanthockey website. I see you are using moneypuck, but I suspect there may be problems to export the data as before. It looks like moneypuck uses a different format. Do you have any idea?
Unfortunately MoneyPuck will not work as a source. If you want to add attempts, you need to find a different source that works with Google Sheets. Good luck!