(Table of contents & other stuff:  )

How to fetch the Nth segment of a URL in MySQL

Let's say you have a database table with a *lot* of URLs in it, and you need to segment them for analysis. What you need, is a way to isolate the segments inside a URL.

To achieve this, we can use a built-in MySQL-function two times. First, we need to return the part of the URL that ends with the segment we need. And next, we strip everything before that specific segment. Also, we need to make sure that if a result doesn't exist, MySQL doesn't return the whole string.

Here is how to fetch the Nth segment in a URL in MySQL:

  1. 1.

    First, use SUBSTRING_INDEX() to divide a string and return the part before a specified number of occurences of a delimiter. In other words: return everything up until the segment we need

  2. 2.

    Second, we strip everything before that segment.

  3. 3.

    Also, put an IF-statement around the whole thing to check if a segment actually is there to work with. Otherwise, return an empty string.

Usage #

For example, if we have a long URL (field name: Page URL), and we want to fetch the sixth segment, we use the following formula:

/* how many segments are there? more than or equal to the segment number we want? Then there is something 'there' to work with */
IF(
	LENGTH(`Page URL`) - LENGTH(REPLACE(`Page URL`, '/', '')) + 1 >= 9,
	/* so return its value */
	SUBSTRING_INDEX(SUBSTRING_INDEX(`Page URL`, '/', 9), '/', -1),
	/* or else return an empty string */
	''
) AS segment6

The following happens here:

  • -

    the IF -statement around the whole things checks how many segments there are. If it is bigger than or the same as the segment number we want, we know we can fetch something

  • -

    if TRUE, fetch the text after 'slash number N' (where N is the index of the segment)

  • -

    if FALSE, return an empty string

Some examples #

For your convenience, I've pre-made a few examples of the query for longer URLs, up to 6 segments. If you test it on the URL of this blog article, you will see that only the first two return something (which they should), and that the rest is empty:

,IF(LENGTH(`Page URL`) - LENGTH(REPLACE(`Page URL`, '/', '')) + 1 >= 4, SUBSTRING_INDEX(SUBSTRING_INDEX(`Page URL`, '/', 4), '/', -1), '') AS segment1 /* 'possibly-useful-article-about' */
,IF(LENGTH(`Page URL`) - LENGTH(REPLACE(`Page URL`, '/', '')) + 1 >= 5, SUBSTRING_INDEX(SUBSTRING_INDEX(`Page URL`, '/', 5), '/', -1), '') AS segment2 /* 'fetch-nth-segment-url' */
,IF(LENGTH(`Page URL`) - LENGTH(REPLACE(`Page URL`, '/', '')) + 1 >= 6, SUBSTRING_INDEX(SUBSTRING_INDEX(`Page URL`, '/', 6), '/', -1), '') AS segment3 /* empty */
,IF(LENGTH(`Page URL`) - LENGTH(REPLACE(`Page URL`, '/', '')) + 1 >= 7, SUBSTRING_INDEX(SUBSTRING_INDEX(`Page URL`, '/', 7), '/', -1), '') AS segment4 /* empty */
,IF(LENGTH(`Page URL`) - LENGTH(REPLACE(`Page URL`, '/', '')) + 1 >= 8, SUBSTRING_INDEX(SUBSTRING_INDEX(`Page URL`, '/', 8), '/', -1), '') AS segment5 /* empty */
,IF(LENGTH(`Page URL`) - LENGTH(REPLACE(`Page URL`, '/', '')) + 1 >= 9, SUBSTRING_INDEX(SUBSTRING_INDEX(`Page URL`, '/', 9), '/', -1), '') AS segment6 /* empty */
→ Call to action ←

Want more?

Then let's close the virtual gap between us by one step.