Sitecore wildcard items


In Sitecore, we have the option to use “wildcard” items to capture multiple URLs with a single piece of content. The goal of this blog post is to provide an overview of how this out-of-the-box wildcard resolving works in regard to URLs, paths, queries and when using Sitecore Content Search. The post is based on a clean Sitecore 10.3 XM installation.

Wildcard items are simply items named * (asterisk), and will match any item name on the same level as the wildcard item. They are used exactly like normal items and can be placed in a folder structure. More elaborate patterns like a* or a? are not supported by Sitecore.

In this post we will use a simple content tree with a parent wildcard item and a nested child wildcard item, i.e. /sitecore/content/Home/* and /sitecore/content/Home/*/*.

Wildcard items and URLs

With a wildcard in place, elements within a URL that do not specifically match an item will be captured by the wildcard item instead. The resolving of a URL to a matching item happens in the ItemResolver and can be changed and extended. However, wildcard items work with the default ItemResolver, and with the content tree above we are now able to access the following URLs, even though no items named x or y exist:

URL                        Item
https://localhost/x        /sitecore/Content/Home/*
https://localhost/x/y      /sitecore/Content/Home/*/*

If we request a URL that goes deeper than the content tree, we will notice that the parent wildcard item captures it:

URL                        Item
https://localhost/x/y/z    /sitecore/Content/Home/*

We will hence formulate this rule:

URLs are first evaluated with the / (slashes) representing levels of Sitecore items, and second with the / (slashes) being considered part of the item name. This means that if we had the content tree /*/*/*, the path /x/y/z would resolve to the innermost wildcard item. However, if we only have the content tree /*/*, the parent wildcard will catch the URL as if the slashes in /x/y/z were simply part of the item name.

Wildcard items and non-wildcard items

If we introduce some non-wildcard items below the parent wildcard item, we will see how the wildcard items interact with non-wildcard items:

URL                          Item
https://localhost/x/a        /sitecore/Content/Home/*/a
https://localhost/x/b        /sitecore/Content/Home/*/b
https://localhost/x/a/aa     /sitecore/Content/Home/*/a/aa
https://localhost/x/b/bb     /sitecore/Content/Home/*/b/bb
https://localhost/x/a/cc     /sitecore/Content/Home/*/*/cc
https://localhost/x/b/cc     /sitecore/Content/Home/*/*/cc
https://localhost/x/c/cc     /sitecore/Content/Home/*/*/cc
https://localhost/x/a/z      /sitecore/Content/Home/*
https://localhost/x/b/z      /sitecore/Content/Home/*
https://localhost/x/c/z      /sitecore/Content/Home/*

We will summarize these findings into this rule:

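Non-wildcard items take precedence over wildcard items, no matter the sort order. If the non-wildcard branch cannot match the rest of the URL (e.g., /x/a/cc, where a has no child named cc), the wildcard sibling is tried instead.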
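The two rules can be condensed into a small toy model. This is only a sketch that reproduces the tables above; it is not Sitecore's ItemResolver, and the names Node, WildcardModel, ResolveStrict and ResolveUrl are made up for illustration. It assumes that an exact name match is tried before the wildcard sibling, and that URL resolving falls back to the top-level wildcard when no level-by-level match exists.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory stand-in for a Sitecore item (not a Sitecore type).
public sealed class Node
{
    public string Name { get; }
    public Node Parent { get; private set; }
    public List<Node> Children { get; } = new List<Node>();

    public Node(string name) { Name = name; }

    // Adds a child and returns this node, so that trees can be built fluently.
    public Node Add(Node child)
    {
        child.Parent = this;
        Children.Add(child);
        return this;
    }

    public string Path => Parent == null ? "/" + Name : Parent.Path + "/" + Name;
}

public static class WildcardModel
{
    // Level-by-level matching: named children first, then the wildcard
    // sibling, backtracking whenever a branch cannot match the remaining segments.
    public static Node ResolveStrict(Node current, string[] segments, int index = 0)
    {
        if (index == segments.Length) return current;
        string segment = segments[index];

        foreach (Node child in current.Children)
        {
            if (child.Name == "*") continue;
            if (!string.Equals(child.Name, segment, StringComparison.OrdinalIgnoreCase)) continue;
            Node hit = ResolveStrict(child, segments, index + 1);
            if (hit != null) return hit;
        }

        foreach (Node child in current.Children)
        {
            if (child.Name != "*") continue;
            Node hit = ResolveStrict(child, segments, index + 1);
            if (hit != null) return hit;
        }

        return null;
    }

    // URL matching: strict matching first; if that fails, the whole path is
    // treated as one item name, which only the top-level wildcard can match.
    public static Node ResolveUrl(Node home, string urlPath)
    {
        string[] segments = urlPath.Split(new[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
        return ResolveStrict(home, segments) ?? home.Children.Find(c => c.Name == "*");
    }
}
```

Building the tree from this section (a root named Home containing *, with a/aa, b/bb and */cc below the wildcard) and calling WildcardModel.ResolveUrl(home, "/x/a/cc") walks into a, fails to find cc there, and backtracks to the wildcard sibling, while "/x/a/z" ends up at the parent wildcard through the fallback.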

Wildcard items and content paths

When we access items via code using database.GetItem() and a content path, we are able to use the asterisk in the path, but we are also able to use other item names and have the wildcard match them, just as we did with URLs:

Path                             Item
/sitecore/content/Home/*         /sitecore/Content/Home/*
/sitecore/content/Home/*/*       /sitecore/Content/Home/*/*
/sitecore/content/Home/*/*/*     null
/sitecore/content/Home/x         /sitecore/Content/Home/*
/sitecore/content/Home/x/y       /sitecore/Content/Home/*/*
/sitecore/content/Home/x/y/z     null

Note, however, that the rule about the parent wildcard catching items in a “too-deep” structure does not apply when we access items via code.
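A minimal sketch of how such a lookup could be wrapped for experimenting with the paths in the table. The class and method names are hypothetical, and the use of the master database is an assumption; the Sitecore calls themselves (Factory.GetDatabase, Database.GetItem, Item.Paths.FullPath) are the standard API and need the Sitecore.Kernel assembly and a running Sitecore context.

```csharp
using Sitecore.Configuration;
using Sitecore.Data;
using Sitecore.Data.Items;

// Hypothetical helper for trying out content paths against wildcard items.
public static class WildcardPathDemo
{
    // Returns the full path of the resolved item, or "null" if nothing was resolved.
    public static string Describe(string contentPath)
    {
        // Assumption: we look in the master database.
        Database database = Factory.GetDatabase("master");

        // GetItem returns null when the path cannot be resolved,
        // e.g. for a path deeper than the nested wildcards allow.
        Item item = database.GetItem(contentPath);

        return item == null ? "null" : item.Paths.FullPath;
    }
}
```

Calling WildcardPathDemo.Describe("/sitecore/content/Home/x/y") should, going by the table above, report the nested wildcard item, while "/sitecore/content/Home/x/y/z" should report null.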

And again, if we add some non-wildcard items, we see the same patterns of item resolving:

Path                                Item
/sitecore/Content/Home/x/a          /sitecore/Content/Home/*/a
/sitecore/Content/Home/x/b          /sitecore/Content/Home/*/b
/sitecore/Content/Home/x/a/aa       /sitecore/Content/Home/*/a/aa
/sitecore/Content/Home/x/b/bb       /sitecore/Content/Home/*/b/bb
/sitecore/Content/Home/x/a/cc       /sitecore/Content/Home/*/*/cc
/sitecore/Content/Home/x/b/cc       /sitecore/Content/Home/*/*/cc
/sitecore/Content/Home/x/c/cc       /sitecore/Content/Home/*/*/cc
/sitecore/Content/Home/x/a/z        null
/sitecore/Content/Home/x/b/z        null
/sitecore/Content/Home/x/c/z        null


Again, we see that the rule about the parent wildcard catching items in a too-deep content tree does not apply when we access items via code. It might in fact be a bug in the ItemResolver, and we should probably avoid relying on the fact that too-deep URLs are being matched by wildcard items.

Wildcard items and queries

Sitecore queries have (as far as I know) only limited support for wildcard items.

To illustrate this, let us look at this content tree, with two wildcard items (one parent and one child) as well as two non-wildcard items:

Obviously, the query /sitecore/content/Home/* will return both “outer” items, as the wildcard in queries has a special meaning, matching all items. The same happens when accessing all descendants. E.g., the query /sitecore/content/Home//* will return all four items.

Also, providing a non-existing item name (e.g., /sitecore/content/Home/x) will not return the wildcard item as it does when resolving URLs and content paths.

If we want to target the wildcard items specifically, escaping the wildcard (/sitecore/content/Home/#*#) will not work. We can however use the key attribute, which contains the item name in lower case (* in case of wildcard items). This allows us to e.g. get all wildcard descendants using this query: /sitecore/content/Home//*[@@key = '*']
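As a sketch, the key-based query could be run from code like this. The helper class and method names are hypothetical; Database.SelectItems is the standard Sitecore API for executing a Sitecore query and requires the Sitecore.Kernel assembly.

```csharp
using Sitecore.Data;
using Sitecore.Data.Items;

// Hypothetical helper that collects all wildcard items below Home.
public static class WildcardQueries
{
    public static Item[] GetWildcardDescendants(Database database)
    {
        // @@key holds the lower-cased item name, which is "*" for wildcard items.
        const string query = "/sitecore/content/Home//*[@@key = '*']";

        // Defensive null check in case the query yields no result array.
        return database.SelectItems(query) ?? new Item[0];
    }
}
```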

Wildcards in Content Search

When using Sitecore Content Search with SOLR, we will quickly run into problems, as the asterisk is already a reserved character in SOLR.

If we take the content tree from above, we can observe that the following predicates work as expected:

var allItems =
    context.GetQueryable<SearchResultItem>().
    Where(x => x.Path.StartsWith("/sitecore/content/Home"));

var parentWildcardItem =
    context.GetQueryable<SearchResultItem>().
    Where(x => x.Path.Equals("/sitecore/content/Home/*")); 

However, if we want to get all wildcard items, for example, this predicate will not work:

var wildcardItems =
    context.GetQueryable<SearchResultItem>().
    Where(x => x.Name.Equals("*"));

Even when escaping the asterisk, e.g. with \ (the normal SOLR escape character), #*# or "*", the search will return all items, and not only the wildcard ones.

I have not dived into the Content Search code, but this indicates that the asterisk reaches SOLR unencoded. If queries like this are needed, one approach would be to introduce a computed field indicating whether an item is a wildcard item or not, thereby avoiding the asterisk in the predicates.
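A minimal sketch of such a computed field could look like this. The class names and the field name is_wildcard are assumptions made for illustration, and the field still has to be registered as a computed index field in the index configuration before it shows up in SOLR.

```csharp
using Sitecore.ContentSearch;
using Sitecore.ContentSearch.ComputedFields;
using Sitecore.ContentSearch.SearchTypes;
using Sitecore.Data.Items;

// Hypothetical computed field that flags wildcard items at index time.
public class IsWildcardComputedField : IComputedIndexField
{
    // Both properties are populated from the index configuration.
    public string FieldName { get; set; }
    public string ReturnType { get; set; }

    public object ComputeFieldValue(IIndexable indexable)
    {
        // Only Sitecore items are relevant; other indexables are skipped.
        var indexableItem = indexable as SitecoreIndexableItem;
        Item item = indexableItem?.Item;
        if (item == null) return null;

        // Wildcard items are simply items named "*".
        return item.Name == "*";
    }
}

// Hypothetical result type exposing the computed field to LINQ predicates.
public class WildcardAwareResult : SearchResultItem
{
    [IndexField("is_wildcard")] // assumed field name, must match the configuration
    public bool IsWildcard { get; set; }
}
```

With the field indexed, a predicate such as context.GetQueryable<WildcardAwareResult>().Where(x => x.IsWildcard) no longer contains an asterisk, so the reserved character never reaches SOLR.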

Summary

As we have seen, wildcard items can be a powerful tool, but they do involve some pitfalls, as they are handled differently in URLs, paths and queries, and have only limited support in Content Search. I would advise against creating too complex content trees using wildcards, as the combination of nested wildcard items and non-wildcard items on different levels might result in slightly different behaviours across the different ways Sitecore allows us to access our content.