(Update )

How to fetch the Nth segment of a URL

How to segment the path of a URL into its parts. A simple MySQL function to aid you when doing website analysis.

Let's say you have a database table with a *lot* of URLs in it, and you need to segment the URLs for analysis. What you need, is a way to isolate the segments inside a URL.

To achieve this, we can use a single MySQL-function two times. First, we need to return the part of the URL that ends with the segment we need. And next, we strip everything before that specific segment.

Here is how to fetch the Nth segment in a URL in MySQL:

  1. First, use SUBSTRING_INDEX() to divide a string and return the part before a specified number of occurences of a delimiter. In other words: return everything up until the segment we need
  2. Second, we strip everything before that segment.
  3. Et voila! We have our segment. Continue reading for an explanation and examples.

Basically the formula is this:

SUBSTRING_INDEX(
	SUBSTRING_INDEX(URL , DELIMITER , N) ,
	DELIMITER ,
	- 1
)

Where DELIMITER is a slash /, and N is the number of the segment you want returned. It first fetches everything before a specified slash. Then takes that result, and returns everything after the last slash (for this we use the negative number: -1).

If I would do this for the URL of this very article, which is https://eikhart.com/blog/fetch-nth-segment-url, the first segment should be blog, and the second should be fetch-nth-segment-url:

-- 'blog'
SUBSTRING_INDEX(
	SUBSTRING_INDEX(`URL` , '/' , 4) ,
	'/' ,
	- 1
)
-- 'fetch-nth-segment-url
SUBSTRING_INDEX(
	SUBSTRING_INDEX(`URL` , '/' , 5) ,
	'/' ,
	- 1
)

Example 1: SELECT

To test it, I made a table called 'test', and within it a field called URL, with the full URL to this article in it. Next, I executed the following query:

SELECT
	SUBSTRING_INDEX(
		SUBSTRING_INDEX(`URL` , '/' , 4) ,
		'/' ,
		- 1
	) AS segment1 ,
	SUBSTRING_INDEX(
		SUBSTRING_INDEX(`URL` , '/' , 5) ,
		'/' ,
		- 1
	) AS segment2
FROM
	`test`;

The result being, that segment1 holds the value 'blog', and segment2 holds the value 'fetch-nth-segment-url', which was the intention after all ;)

Example 2: UPDATE

Another way is to update a database with the values of every segment. This is quite handy if you're dealing with a large number of URLs, for instance an export of an entire website.

First, add a few fields to your table to hold the values of the segments. Call them 'segment1', 'segment2', etc.

Then, update the values of the new fields with the segments:

UPDATE `Website`
SET segment1 = SUBSTRING_INDEX(
	SUBSTRING_INDEX(`URL` , '/' , 4) ,
	'/' ,
	- 1
) ,
 segment2 = SUBSTRING_INDEX(
	SUBSTRING_INDEX(`URL` , '/' , 5) ,
	'/' ,
	- 1
) ,
 segment3 = SUBSTRING_INDEX(
	SUBSTRING_INDEX(`URL` , '/' , 6) ,
	'/' ,
	- 1
);

This way, it is far easier to segment all URLs of a website for further analysis.

What's next?

Not much really. But if you think this was useful to you, it might be for others also: