How to access “most viewed pages” data using the Google Analytics API 3.0 and PHP
If you are running Google Analytics on your site, you can use its API to read visitor data and display the information. Here is a quick framework how-to on using the API to get the data to create a “most viewed” widget based on the pages that get the most pageviews. The data is accessed periodically in the background from a script on your website’s server.
Version 3.0 of the Google Analytics API implements the OAuth 2.0 protocol. Among the benefits of OAuth are that applications don’t have to store user credentials, and that users can revoke an application’s permissions at any time. However, the OAuth authorization process is more involved than some other API approaches. Also, the Google documentation is a bit light when it comes to summarizing the process for server-side data access.
This how-to will use PHP to make the server-side calls to the Google Analytics API, which returns information in the JSON format. The code examples assume that you have cURL for PHP installed, and that you are using PHP 5.2.0 or later in order to use the json_decode function to parse the Google responses.
The setup steps are as follows: We will first set up the application in the Google API console. Then we’ll authorize the application with a Google account that has access to your site’s data. Using the resulting authorization code, we’ll request a refresh token from Google.
From there, you can build the application part: The refresh token is used to get access tokens that are finally used to actually read the data from Google. The details of implementing the data for the widget will be left up to you.
Google API console setup
Log in to the Google API console with a Google account. Note that the account that you log in with here does not need to be the same account that has access to your site’s data.
At the top of the left menu, create a new API project in the dropdown list.
Once your new project is created, select the “Services” page from the left menu, and turn on the Google Analytics API.
Note that there are per-day query limits for each API service.
Select the “API Access” page from the left menu, then create a web application client.
The name of the web application client will be displayed to the user (you) later when you authorize the application.
You will need both the client ID and the client secret later. They are used in a PHP script that must be accessible at the redirect URI; you can edit the redirect URI to add the “.php” suffix as necessary.
If you need to access data server-side when an authorized user is not logged in, the web application client is not sufficient. You also have to create a server API key, limited to the IP address(es) of your server(s).
Authorize the application
Next, you’ll have to authorize the application to access your Google Analytics data. Remember that the account that created the API application does not have to be the same account that has access to your site’s data.
To prompt the user (you) to authorize your application, you have to build the authorization URL. First, however, we should build a very basic script at the configured redirect URI. The first iteration of the script has this code at http://www.theblog.ca/oauth2callback.php:
<?php var_dump( $_REQUEST ); ?>
This is simply going to output to your screen the raw data that is being sent to the script via either POST or GET parameters. It is not the complete script yet, but will be used for illustrative purposes.
To build the authorization URL (the URL https://accounts.google.com/o/oauth2/auth followed by a set of GET parameters), you’ll need the following information, shown here with the values for our example application:
- Callback URL: http://www.theblog.ca/oauth2callback.php
- Client ID: 63311316168.apps.googleusercontent.com
- Client secret: _iXNRZ5zMj1beMTab0wA4lXC
You will also need to specify a response type (always “code”), the type of access you are requesting (“offline” since we are accessing the data in the background without user intervention), and a “scope” specific to Google Analytics. See the Google API OAuth 2.0 documentation for information on the URL parameters and the resulting response.
For our example application, the authorization URL is:
https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=63311316168.apps.googleusercontent.com
&redirect_uri=http://www.theblog.ca/oauth2callback.php&access_type=offline
&scope=https://www.googleapis.com/auth/analytics.readonly
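If you would rather not assemble that URL by hand, here is a minimal sketch that builds it with PHP’s http_build_query function, which also URL-encodes the redirect URI and scope values. The function name buildAuthorizationUrl is made up for this illustration; the parameter values are the ones from our example application.

```php
<?php
// Hypothetical helper (not part of the original script): builds the
// authorization URL from its parts, URL-encoding each value.
function buildAuthorizationUrl( $clientId, $redirectUri, $scope )
{
    $params = array(
        'response_type' => 'code',     // always "code"
        'client_id'     => $clientId,
        'redirect_uri'  => $redirectUri,
        'access_type'   => 'offline',  // background access without the user present
        'scope'         => $scope,
    );

    // Pass '&' explicitly so the result does not depend on php.ini settings
    return 'https://accounts.google.com/o/oauth2/auth?' . http_build_query( $params, '', '&' );
}

$authUrl = buildAuthorizationUrl(
    '63311316168.apps.googleusercontent.com',
    'http://www.theblog.ca/oauth2callback.php',
    'https://www.googleapis.com/auth/analytics.readonly'
);

echo $authUrl, "\n";
```

The encoded form (with %3A and %2F in place of the colons and slashes inside the parameter values) should be treated by Google as equivalent to the readable form shown above.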
Get the permanent refresh token
When you access the authorization URL, it should prompt the user (you) to allow the application to access the site data. Once you click the authorization button, you should be redirected to your redirect URI (the callback script), which you will discover has received a “code” GET parameter. You will have to use the value of that GET parameter to ask Google for the refresh token.
Our example uses cURL to build the refresh token request. Its URL is similar to the authorization URL, but this time we post a set of fields to https://accounts.google.com/o/oauth2/token. According to the Google API OAuth 2.0 documentation, you’ll need the client ID, client secret, and redirect URI information from before, as well as the “code” and an extra field “grant_type”, which is always “authorization_code”.
In oauth2callback.php, use the following code:
<?php
if( isset( $_GET['code'] ) )
{
    // Exchange the authorization code for the tokens
    $ch = curl_init();
    $timeout = 5;
    curl_setopt( $ch, CURLOPT_URL, 'https://accounts.google.com/o/oauth2/token' );
    curl_setopt( $ch, CURLOPT_POST, 1 );
    // The code is URL-encoded because it can contain characters such as "/"
    curl_setopt( $ch, CURLOPT_POSTFIELDS, 'code=' . urlencode( $_GET['code'] ) . '&client_id=63311316168.apps.googleusercontent.com&client_secret=_iXNRZ5zMj1beMTab0wA4lXC&redirect_uri=http://www.theblog.ca/oauth2callback.php&grant_type=authorization_code' );
    curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );
    curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, $timeout );
    $data = curl_exec( $ch );
    curl_close( $ch );

    // The response is JSON; decode it into an associative array
    $result = json_decode( $data, true );
    var_dump( $result );
}
else
{
    // No authorization code yet: show whatever was sent to the script
    var_dump( $_REQUEST );
}
?>
Note that we are once again using a quick and dirty “var_dump” to output the returned information on the screen. If you visit the same authorization URL in your browser once again, you should get redirected to the same URL as before, but this time showing the response from the extra request for the refresh token. This response should, of course, contain the refresh token.
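Once you have seen the structure of that response, you may want to pull out just the refresh token. The following is a minimal sketch; the function name extractRefreshToken and the sample response body are made up for illustration, and it assumes that an unsuccessful response carries an “error” field instead of the tokens.

```php
<?php
// Hypothetical helper: returns the refresh token from the raw JSON body
// of the token response, or null if the response is unusable.
function extractRefreshToken( $json )
{
    $result = json_decode( $json, true );

    // Invalid JSON, or a response assumed to describe an error
    if( !is_array( $result ) || isset( $result['error'] ) )
    {
        return null;
    }

    return isset( $result['refresh_token'] ) ? $result['refresh_token'] : null;
}

// Sample response body, made up for illustration
$sample = '{"access_token":"1/fFBGRNJru1FQd44AzqT3Zg","expires_in":3920,'
    . '"token_type":"Bearer","refresh_token":"1/7DRzjcqm-ypH9By1FrY3T-l_oSW3KdklC0LJuZLk5Q0"}';

var_dump( extractRefreshToken( $sample ) );
var_dump( extractRefreshToken( '{"error":"invalid_grant"}' ) );
```

Store the refresh token somewhere private on your server, since anyone who has it (along with your client ID and client secret) can request access tokens.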
One extra note: to see which applications you’ve authorized under a Google account, you can go to this URL: https://accounts.google.com/IssuedAuthSubTokens, where you can also revoke access, forcing the application(s) to ask for your authorization again.
Use the refresh token and access the data
The refresh token is the most important piece of information, as it is essentially the permanent key that you need to access the Google Analytics data. (It is permanent as long as the user account that authorized the application does not revoke access.) The more accurate description is that the refresh token is used to get a time-limited access token, and that the access token is used along with the API key you had set up, valid only for specific IP addresses, to access the data.
In our example, the refresh token is:
1/7DRzjcqm-ypH9By1FrY3T-l_oSW3KdklC0LJuZLk5Q0
The following example script is a framework for what you would use with your website, and assumes that you are running it periodically, about once an hour, and therefore you need to ask for a new access token every time. A Google Analytics access token typically has an expiry of just over 1 hour, and thus you can use it for more than 1 request within that hour.
The script gets the top 10 pages based on pageviews for the past 1 day.
<?php
// First, ask for an access token using the refresh token
$ch = curl_init();
$timeout = 5;
curl_setopt( $ch, CURLOPT_URL, 'https://accounts.google.com/o/oauth2/token' );
curl_setopt( $ch, CURLOPT_POST, 1 );
curl_setopt( $ch, CURLOPT_POSTFIELDS, 'refresh_token=1/7DRzjcqm-ypH9By1FrY3T-l_oSW3KdklC0LJuZLk5Q0&client_id=63311316168.apps.googleusercontent.com&client_secret=_iXNRZ5zMj1beMTab0wA4lXC&grant_type=refresh_token' );
curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );
curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, $timeout );
$data = curl_exec( $ch );
curl_close( $ch );
$result = json_decode( $data, true );

/*
You should get a response like this in the result:

{
  "access_token":"1/fFBGRNJru1FQd44AzqT3Zg",
  "expires_in":3920,
  "token_type":"Bearer"
}

Then, you can use that access token to build queries
(see http://code.google.com/apis/analytics/docs/gdata/gdataExplorer.html
or http://code.google.com/apis/explorer/#_s=analytics&_v=v3 for examples):
*/
if( isset( $result['access_token'] ) )
{
    // Look up only the last day of visits
    $endDate = date( 'Y-m-d' );
    $startDate = date( 'Y-m-d', strtotime( '-1 day' ) );

    /*
    "ids" value comes from this URL in the last portion of the URL, after the "p":
    https://www.google.com/analytics/web/#dashboard/default/a381759w192893p9122283/
    Or use http://code.google.com/apis/analytics/docs/gdata/gdataExplorer.html
    to show the GA ID for each of your Analytics accounts

    "key" is the API key that you'd set up in the Google APIs console,
    restricted to certain IP addresses
    */
    $url = 'https://www.googleapis.com/analytics/v3/data/ga?'
        . 'key=AIzaSyDyWgfb45VYfVYdVnmpH4JZCCRNas5P0SE&ids=ga:9122283&start-date=' . $startDate
        . '&end-date=' . $endDate
        . '&metrics=ga:pageviews&sort=-ga:pageviews&dimensions=ga:pagePath&max-results=10';

    // Send the access token in the Authorization header
    $ch = curl_init();
    $timeout = 5;
    curl_setopt( $ch, CURLOPT_URL, $url );
    curl_setopt( $ch, CURLOPT_HTTPHEADER, array( 'Authorization: Bearer ' . $result['access_token'] ) );
    curl_setopt( $ch, CURLOPT_RETURNTRANSFER, 1 );
    curl_setopt( $ch, CURLOPT_CONNECTTIMEOUT, $timeout );
    $data = curl_exec( $ch );
    curl_close( $ch );

    $mostViewedRaw = json_decode( $data, true );
    var_dump( $mostViewedRaw );
}
?>
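If your script runs more often than once an hour, you could reuse the access token instead of requesting a new one on every run. Here is a minimal sketch of the expiry check, assuming that you store the decoded token response yourself along with a “fetched_at” Unix timestamp. That field, the function name accessTokenIsUsable, and the 60-second safety margin are all made up for this illustration.

```php
<?php
// Hypothetical helper: decides whether a stored access token can still be used.
// $cached is assumed to be the decoded token response plus a 'fetched_at'
// Unix timestamp that your own script added when it saved the token.
function accessTokenIsUsable( $cached, $now, $safetyMargin = 60 )
{
    if( !isset( $cached['access_token'], $cached['expires_in'], $cached['fetched_at'] ) )
    {
        return false;
    }

    // Treat the token as expired slightly early to allow for slow requests
    $expiresAt = $cached['fetched_at'] + $cached['expires_in'];
    return $now < ( $expiresAt - $safetyMargin );
}

// Made-up stored token for illustration
$cached = array(
    'access_token' => '1/fFBGRNJru1FQd44AzqT3Zg',
    'expires_in'   => 3920,
    'fetched_at'   => 1000000,
);

var_dump( accessTokenIsUsable( $cached, 1000000 + 1800 ) ); // half an hour later
var_dump( accessTokenIsUsable( $cached, 1000000 + 3900 ) ); // inside the safety margin
```

In a real script you would pass time() as the second argument, and ask for a new access token with the refresh token whenever the check fails.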
From here, it is up to you to use the returned data in ways that best suit your needs and application. For example, for performance reasons and API call limitations, you might want to run the script only about once an hour, and store the results in a database table or a data structure specific to your content management system. You might also want to do some extra filtering on the Google Analytics results to only display certain types of pages (omitting the front page, for example). Note that the resulting Google Analytics data is simply output to the screen in our example so that you can see its structure and determine how to use it to suit your needs.
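As a starting point for that filtering, here is a minimal sketch that turns the decoded response into a simple list of paths and pageview counts, leaving out the front page. It assumes that the response contains a “rows” array in which each row holds the page path followed by the pageview count; check this against the structure that your own var_dump shows. The function name extractMostViewed and the sample data are made up for illustration.

```php
<?php
// Hypothetical helper: converts the decoded API response into an array of
// path => pageviews, skipping any paths that should not be displayed.
// Assumes $response['rows'] is a list of array( pagePath, pageviews ).
function extractMostViewed( $response, $excludedPaths = array( '/' ) )
{
    $pages = array();

    if( !isset( $response['rows'] ) || !is_array( $response['rows'] ) )
    {
        return $pages;
    }

    foreach( $response['rows'] as $row )
    {
        list( $path, $views ) = $row;

        if( in_array( $path, $excludedPaths, true ) )
        {
            continue;
        }

        $pages[$path] = (int) $views;
    }

    return $pages;
}

// Made-up sample data in the assumed shape
$sample = array( 'rows' => array(
    array( '/', '120' ),
    array( '/archives/some-post', '85' ),
    array( '/about', '40' ),
) );

print_r( extractMostViewed( $sample ) );
```

Because the query already sorts by pageviews in descending order, the resulting array keeps the most viewed pages first.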