How to segment the path of a URL into its parts. A simple MySQL function to aid you when doing website analysis.

Let's say you have a database table with a lot of URLs in it, and you need to segment the URLs for analysis. What you need, is a way to isolate the segments inside a URL.

To achieve this, we can use a single MySQL-function two times. First, we need to return the part of the URL that ends with the segment we need. And next, we strip everything before that specific segment.

Here is how to fetch the Nth segment in a URL in MySQL:

  1. First, use SUBSTRING_INDEX() to divide a string and return the part before a specified number of occurences of a delimiter. In other words: return everything up until the segment we need
  2. Second, we strip everything before that segment.
  3. Et voila! We have our segment. Continue reading for an explanation and examples.

Basically the formula is this:


Where DELIMITER is a slash /, and N is the number of the segment you want returned. It first fetches everything before a specified slash. Then takes that result, and returns everything after the last slash (for this we use the negative number: -1).

If I would do this for the URL of this very article, which is, the first segment should be blog, and the second shoud be fetch-nth-segment-url:

-- 'blog'
-- 'fetch-nth-segment-url

Example 1: SELECT

To test it, I made a table called 'test', and within it a field called URL, with the URL of this article in it:

My Sql 5 7 17 Local Meuk Test And Site

Next, I executed the following query:

SUBSTRING_INDEX( SUBSTRING_INDEX( `URL`, '/', 4 ), '/', -1 ) AS segment1 ,
SUBSTRING_INDEX( SUBSTRING_INDEX( `URL`, '/', 5 ), '/', -1 ) AS segment2 
FROM `test`;

And you can see the output:

My Sql 5 7 17 Local Meuk Test

Example 2: UPDATE

Another way is to update a database with the values of every segment. This is quite handy if you're dealing with a large number of URLs, for instance an export of an entire website.

First, add a few fields to your table (another one, that holds the values of all URLs. Let's call it 'WebsiteX') to hold the values of the segments. Call them 'segment1', 'segment2', etc.

Then, update the values of the new fields with the segments:

segment1 = SUBSTRING_INDEX( SUBSTRING_INDEX( `URL`, '/', 4 ), '/', -1 )
,segment2 = SUBSTRING_INDEX( SUBSTRING_INDEX( `URL`, '/', 5 ), '/', -1 )
,segment3 = SUBSTRING_INDEX( SUBSTRING_INDEX( `URL`, '/', 6 ), '/', -1 );

This way, it is far easier to segment all URLs of a website for further analysis.