{kun´ēzē}
 
30 Oct 2023

“Search” vs. “Smart Search”

Updated: 01 November 2023

What is Joomla’s “Smart Search” facility?

How to set up smart search on your Joomla website

Estimating the size of the finder tables

Finder space estimator

The J! CMS has always had a basic search component.  With the release of Joomla! 2.5 (see “New Features in Joomla! 2.5”), if anyone can remember that far back, the J! CMS incorporated a Google-like content indexing and predictive text feature that tries to “guess” what you may be looking for.  With the release of J! 4.x, the basic search component was retired, leaving only the “smart” search component.

“I’ve just migrated from Joomla 3 to Joomla 4.  Read all the articles concerning smart search and deleted the old Joomla 3 search package.  I followed the steps concerning the indexation of the contents with the smart search component and, suddenly, my database exceeded the limit because of a rise in 97 Mb for the indexation of more than 300 articles.  Is there a way to dodge this or am I “condemned” to buy more disk space from my provider?”
(a forum user, Joomla forum, 27-Jun-2023)

This article will attempt to answer some of the misgivings and concerns that people have about using Joomla’s “new” smart search component.  Bear in mind that there’s actually nothing new about this feature; it’s been around since 2012.  The only news here is that the old basic search component was removed from J! 4.0 in 2021.  Various attempts (https://github.com/joomla-extensions/search) to resuscitate that component seem to have been abandoned and it’s unlikely that further efforts will be made in future.  Some people may be disappointed that the old search component no longer exists as of J! 4.x, or they may feel that a “smart” search tool is overkill for their needs.  Either way, let’s look at what people need to know before they implement smart search indexing, which options affect the indexer, and whether other (i.e. external) search tools may be a better fit.

How to set up smart search on your Joomla website

The basic setup guide for J! 4.x (and J! 5.x) is here:  “Smart Search: Indexed Content” (source:  Joomla! Documentation™, retrieved 30-Oct-2023).  A more detailed guide to transitioning from J! 3.x to J! 4.x is here:  “Transition your Joomla 4 website from Search to Smart Search” (source:  Buisard, O; Joomla! Community Magazine™, 20-Dec-2021).

The size of the search index is dependent on a number of factors:

  • What content you are indexing; this depends on what finder plugins you have enabled.
  • What options you want to use with the indexer (see the image at the right of this page).
  • How many items (e.g. articles, categories, tags, news feeds, etc.) you want indexed.
  • How many words are in the articles (categories, tags, news feeds, etc.).
  • What version of Joomla you are using.

Remember the question I quoted at the start of this article:  it’s not a question of whether you have 300 articles (in the example cited, the user stated that they had 533 articles) or whether you have 3,000 or 30,000 articles.  The size of the index is related to what you want indexed and how many words each item has.  Intuitively, of course, the more content you want indexed, the more space your database will need to store the indexed data.  So, just how “big” is really big?

Estimating the size of the finder tables

I was unable to find an authoritative guide to estimating how much space is needed to index the content in a J! website, and I would love to know if an online calculator exists somewhere.  I therefore decided to construct my own model to calculate the space requirements for Joomla’s smart search; the space required to store the data depends on the factors mentioned earlier.

Using one small Joomla 5.0 website with 20 articles, the smart search statistics are shown below:

Smart search:  search for phrases disabled
Smart search:  common words filtered
Smart search:  search for phrases enabled

In this test website, there are 20 articles including one unpublished and one trashed article.  The article with the smallest word count has 3 words; the largest article has 3,103 words; the mean word count is 573 with a standard deviation (σ) of 713.  The word count distribution across all articles is shown in the graph at the right:
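For readers who want comparable figures for their own site, the following is a minimal sketch of how such word-count statistics could be computed.  It assumes you have already exported the article bodies as plain strings (the sample texts below are made up), it strips HTML tags crudely with a regular expression, and it uses the sample standard deviation; the article does not say which form of σ was used, so treat that choice as an assumption.

```python
import re
import statistics


def word_count_stats(article_texts):
    """Return min, max, mean and sample standard deviation of word counts."""
    if not article_texts:
        raise ValueError("at least one article is required")
    counts = []
    for text in article_texts:
        # Crude tag stripping: replace anything that looks like an HTML tag
        # with a space, then count whitespace-separated words.
        plain = re.sub(r"<[^>]+>", " ", text)
        counts.append(len(plain.split()))
    return {
        "min": min(counts),
        "max": max(counts),
        "mean": statistics.mean(counts),
        # stdev() needs at least two values, so fall back to 0.0 for one article.
        "stdev": statistics.stdev(counts) if len(counts) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical article bodies, for illustration only.
    sample = [
        "<p>Smart search indexes every word in an article.</p>",
        "<h2>Short note</h2><p>Three more words.</p>",
    ]
    print(word_count_stats(sample))
```

A large standard deviation relative to the mean, as in the test site above, suggests that a handful of long articles account for much of the indexed text.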

Ideally, I would have sourced the data from a single test site built in J! 3.10.12, migrated to J! 4.4.0 and then migrated again to J! 5.0.0.  I did not have the time to go through this process.  Instead I chose three test sites with roughly similar characteristics:

  • One each of (a) J! 3.10.12, (b) J! 4.4.0 and (c) J! 5.0.0.
  • A similar number of articles:  (a) 23, (b) 20, and (c) 31.
  • A similar mean word count per article:  (a) 1111, (b) 573 and (c) 975.
  • For (b) and (c), I also checked whether having the index option “search for phrases” enabled or disabled made any difference to the storage space.

A summary report of these three websites appears below:

[Image: smart search comparison for J! 3, 4 and 5]

The data shows there is little difference in space usage between J! 4.x and J! 5.x.  While the number of “terms” (i.e. rows) in the _finder_terms table differs depending on whether the “search for phrases” option is enabled, disk storage doesn’t appear to be impacted.  The significant impact on disk storage is whether the site uses J! 3.x or a later version:  J! 3.x requires 25 _finder* tables while J! 4.x/5.x use 11 _finder* tables; J! 4.x/5.x saves 90% of the disk space formerly used in J! 3.x.
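To take the same kind of measurement on your own site, one option is to ask MySQL or MariaDB for the size of the finder tables.  The sketch below only builds the query and its parameters; it assumes a MySQL-compatible server (where `information_schema.tables` exposes `table_rows`, `data_length` and `index_length`), a Python driver that uses `%s` placeholders, and a table prefix that you supply yourself.  The schema name `joomla_db` and the prefix `abcde_` are placeholders, not values from this article.

```python
def finder_size_query(schema, prefix):
    """Build a query listing the size of the finder tables in one database.

    schema: the database name (hypothetical example: "joomla_db")
    prefix: the Joomla table prefix, including its trailing underscore
            (hypothetical example: "abcde_")
    """
    # "_" and "%" are wildcards in SQL LIKE patterns, so escape them in the
    # prefix to match it literally (backslash is MySQL's default escape).
    escaped = (
        prefix.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    sql = (
        "SELECT table_name, table_rows, "
        "ROUND((data_length + index_length) / 1048576, 2) AS size_mb "
        "FROM information_schema.tables "
        "WHERE table_schema = %s AND table_name LIKE %s "
        "ORDER BY (data_length + index_length) DESC"
    )
    return sql, (schema, escaped + "finder%")


if __name__ == "__main__":
    sql, params = finder_size_query("joomla_db", "abcde_")
    print(sql)
    print(params)
    # With an open DB-API cursor you would then run: cursor.execute(sql, params)
```

Bear in mind that `table_rows` is only an estimate for InnoDB tables, so the row counts reported this way may differ from the “terms” figure shown in the smart search statistics.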

Based on this information, we may be able to extrapolate the size of the finder tables from the number of articles on a site.  Note, the calculations below are very approximate!

Finder space estimator
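As a rough illustration of this kind of extrapolation, here is a minimal sketch that scales a measured index size linearly with the total number of indexed words (articles multiplied by mean words per article).  The linear model is an assumption made for this example and is not necessarily how the figures in this article were derived.  The 2.0 MB sample size is a placeholder and not a measurement from the test sites; substitute the size you measure on your own site.

```python
def estimate_index_mb(sample_mb, sample_articles, sample_mean_words,
                      target_articles, target_mean_words):
    """Extrapolate finder table size, assuming it grows linearly with word volume."""
    if sample_articles <= 0 or sample_mean_words <= 0:
        raise ValueError("sample values must be positive")
    # Storage cost per indexed word, taken from the measured sample site.
    mb_per_word = sample_mb / (sample_articles * sample_mean_words)
    return mb_per_word * target_articles * target_mean_words


if __name__ == "__main__":
    # 20 articles with a mean of 573 words, as in the small test site;
    # the 2.0 MB index size is a made-up placeholder.
    estimate = estimate_index_mb(
        sample_mb=2.0,
        sample_articles=20,
        sample_mean_words=573,
        target_articles=300,
        target_mean_words=573,
    )
    print(f"Estimated finder table size: {estimate:.1f} MB")
```

Because the estimate depends on the enabled finder plugins, the indexer options and the Joomla version, it is best treated as an order-of-magnitude guide rather than a prediction.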

 

So how really “big” is big?

Large database tables have other implications that this article has not addressed.  The Joomla documentation site has a “frequently asked questions” page which may be useful.  Large finder tables can also exhaust the memory allocated to the SQL server, as well as consuming more disk space than you may have available (see https://forum.joomla.org/viewtopic.php?t=954273).  In situations where you cannot operate your website with J!’s smart search, you will have to explore other solutions.

This article does not discuss alternative solutions but you will find some mentioned in the J! forum.  In truth, I do not know what the practical limit may be for the amount of content that you can index using J!’s smart search feature.  I don’t know if you’ll hit a barrier at 5,000 articles + 10,000 tags in 200 categories, or at the five hundred and first article you save, or whether you’ll hit a barrier at all.  Just as every webhosting environment is different, every website is different.  I can say from personal experience, however, that J!’s smart search feature is easy to set up and easy to use.

About the author:

The author has worked in the information technology industry since 1971 and, since retiring from the workforce in 2007, is a website hobbyist specialising in Joomla, a former member of the Kunena project for more than 8 years, and a contributor on The Joomla Forum™.  The opinions expressed in this article are entirely those of the author.

