Counting and fetching them in bulk.
If you work with a large number of URLs and need to do advanced analysis on them, you need to be able to divide them into their specific segments.
Below, I will show you a few ways to work with URL-segments, particularly when dealing with a large number of them.
You can't really directly count the number of occurrences of a specific character (like a slash) in a string (like a URL) with Google Sheets. BUT, there is a workaround:
P.s.: I'm assuming you have already found a way to subtract the domain / host from the URL, and have put the paths of the URls in a separate column.
So, if you have a path in cell A1
, you would:
LEN(A1)
SUBSTITUTE()
function, we can replace slashes with nothing, and then count the length of what remains: (SUBSTITUTE(A1,"/",""))
A1
The whole formula is:
=LEN(A1)-LEN(SUBSTITUTE(A1,"/",""))
Translated: return the length of what's in cell A1, subtract from it the length of cell A1 but with all "/"
stripped from it (substituted by ""
)
Tip: If you want to apply this formula to many thousands of paths without copying the formula every time, use ARRAYFORMULA()
with ISBLANK()
: =ARRAYFORMULA(if(ISBLANK(A:A),"", LEN(A:A)-LEN(SUBSTITUTE(A:A,"/",""))))
If you start searching for 'fetching Url segments' in 'JavaScript' or 'Google Sheets', you will mostly find quite specific regular expressions. Didn't work for me. I had a list of many thousands of URLs/paths, and they were all different. Also, I needed to know all segments of all of them, not just a single segment of a single path.
So, what I did, was to count the number of segments of all of them, and use the MAX()
function to find out how many segments the largest one had (MAX()
returns the largest number in a range, so I applied it to all segments counts). If I knew that, I knew how many columns I needed to reserve for my next function.
SPLIT()
does it what it says: it splits a string into as many parts as it needs, divided by a separator you specify. So if you have a path in cell A1
, like /segment1/segment2/page.html
, and you use something like SPLIT(A1,"/")
in cell B1
, the following will happen:
B1
holds the value segment1
C1
holds the value segment2
D1
holds the value page.html
Again, if you want to use this exact function on all paths, put it in a new column (B1
), and wrap it in ISBLANK()
and ARRAYFORMULA()
:
=ARRAYFORMULA(if(ISBLANK(A:A),"", SPLIT(A:A,"/")))
After that, you can for example use pivot tables to group the segments and start analysing by segment.
If you think this was useful to you, it might be for others also. Feel free to share: