Template:Search link

  1. Some users' default search domain is all namespaces. In cases like a bare regex search, the search engine protects itself by limiting all regex searches. A bare regex that crawls through millions of pages can take over twenty seconds and may even end in an HTTP timeout. During that time very few other regex searches are allowed. Always use a filter with regex.
  2. Searching for an equals sign requires a regex. As with any template, use {{=}} or |1= to pass an equals sign into any parameter, even the link label.
  3. Editors who begin to search Wikipedia's other pages may at times set their default search domain (at Special:Search, Advanced) to all namespaces. Setting the search domain to all is the most likely "set and forget" scenario. Since that includes article space, the usual results are comparable.
  4. Unlike the other data that go into a page's ranking score, word frequency and location data can be kept updated in the index at all times. For each word on the wiki, the index stores a list of the page names where that word can be found. Along with the page name, the word's locations and count are also stored. Apache Lucene is the indexer, and it maintains the data; it uses the term frequency algorithm. For how it does this, see TFIDF Similarity.
  5. Unlike the search index, page-ranking data is not updated immediately. It is updated only when the number of incoming links has changed by more than 20%.
  6. {{search link}} always produces fully specified queries, even if no namespace is given, because it defaults to article space.
  7. A phrase will extend over whitespace unless it contains a bullet. A phrase can extend over an ordered list item, but not an unordered list item. In other words, it can extend over a number sign (#), but not an asterisk (*). The asterisk has special meaning to the analyzer: it marks an item in an unordered list, and it is also used as a modifier in search.
  8. See the ElasticSearch "tokenizer" that CirrusSearch developed.
  9. Stemming, like page ranking, is just a computer algorithm, and prone to needing occasional adjustments.
  10. CirrusSearch uses kstem for the stemmer package, per T56022.
  11. You can equally well use the insource parameter to turn stemming off. Also, please note that T113838 details this related bug: when stemming is turned off for a word, the pages listed in the search results are correct (they don't have stemmed-only variants; they all have the word as given), but any stems in the snippet are, incorrectly, highlighted.
  12. This can't be proven in an example search of this page, but it will work on another page not containing this example. This is because the match, showing in bold as proof here, prefers the proper order. It can be proved by putting the target text on another page, then changing the query initiated here (on the search results page) to that page.
  13. The search namespace matches in the first parameter of a query. This is consistent with its usage in navigation, wikilinking, transclusion, and page naming, where it is always the first word in the field.
  14. To see all namespaces, go to the search results page and click on Advanced. The default namespace shows in parentheses.
  15. Every word on the wiki, plus every word in every uploaded attachment, is indexed together in one search database. CirrusSearch can parse and index thousands of formats.
  16. Characters not allowed in pagenames are # < > [ ] | { }.
  17. Always check the search bar for its indication. Activating the Advanced pane can show the default search domain, and the search box is very obvious with a namespace or prefix term. One way to do this is to click on the search bar search domain instead of clicking on the search button. The only time this does not work is when changing search domains in the Advanced tab: after you change them you must press Search, not Advanced.
  18. To get deepcat as a search parameter, install a gadget that automatically produces incategory:pagename1|pagename2|...|pagename70. To check whether there were more or fewer than 69 subcategories, either go forward and backward in the browser history, or view the HTML source of the search results page and read the <title> element.
  19. In computing it is common to delimit a /regular expression/ with slashes.
  20. The search is not actually done page by page, but the index for the wiki is built page by page in this way.
  21. By doing things like adding a Mozart navigation template to each page about Mozart, wikignomes shore up the wiki infrastructure. Authorship, on the other hand, writes the prose of a page, one page at a time. (You cannot remove the unwanted links with -hastemplate:"Wolfgang Amadeus Mozart".)
  22. A system message is the value of a MediaWiki operations variable. It can consist of a snippet of plain text, wikitext, CSS, or JavaScript. Messages are used to customize the behavior of MediaWiki, especially the user interface as seen by readers, but also the way a message itself appears when shown as a simple message; they exist for each language and locale.
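
The index structure described in note 4 (for each word, the page names where it occurs, with its locations and count) can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the names build_index and tf_idf are hypothetical, the tokenizer is a plain whitespace split, and the score is the textbook tf-idf formula rather than the exact formula Lucene applies.

```python
import math
from collections import defaultdict

def build_index(pages):
    """Map each word to {page name: [positions]}; pages is {page name: text}."""
    index = defaultdict(dict)
    for name, text in pages.items():
        for pos, word in enumerate(text.lower().split()):
            index[word].setdefault(name, []).append(pos)
    return index

def tf_idf(index, word, name, total_pages):
    """Textbook tf-idf score (an assumption; Lucene's formula differs in detail)."""
    postings = index.get(word, {})
    if name not in postings:
        return 0.0
    tf = len(postings[name])                      # count of the word on this page
    idf = math.log(total_pages / len(postings))   # rarer words weigh more
    return tf * idf

# Hypothetical sample pages, for illustration only.
pages = {
    "Mozart": "mozart wrote operas and mozart wrote symphonies",
    "Opera": "operas are staged works",
    "Vienna": "vienna is a city",
}
index = build_index(pages)
score = tf_idf(index, "mozart", "Mozart", len(pages))
```

Because the locations are stored alongside the count, the same structure could in principle also support phrase matching, which needs to know that two words sit at adjacent positions.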
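
The 20% rule in note 5 can be read as a simple threshold check. The sketch below is one possible reading, not the actual CirrusSearch logic: the function name needs_rank_update is hypothetical, and it assumes the change is measured relative to the link count recorded at the last update.

```python
def needs_rank_update(indexed_links, current_links, threshold=0.20):
    """Return True when incoming links changed by more than `threshold`.

    Assumption: the change is relative to the count stored at the last update.
    """
    if indexed_links == 0:
        # No baseline to compare against; treat any new link as a change.
        return current_links > 0
    change = abs(current_links - indexed_links) / indexed_links
    return change > threshold
```

Under this reading, a page that goes from 100 to 120 incoming links is not yet refreshed (exactly 20%), while a move to 121 or down to 79 would trigger an update.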
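
Notes 2, 16, and 18 each describe a small text rule that is easy to get wrong by hand. The helpers below are a hypothetical sketch of those rules; none of these function names belong to MediaWiki or to any gadget. They assume that spaces in page names may be written as underscores in an incategory: term, and they take the limit of 70 names from note 18.

```python
# Characters that note 16 lists as not allowed in page names.
FORBIDDEN = set("#<>[]|{}")

def is_valid_pagename(name):
    """True if the name is non-empty and has none of the forbidden characters."""
    return bool(name) and not (FORBIDDEN & set(name))

def escape_equals(value):
    """Replace '=' with {{=}} so the value can be passed as a template parameter."""
    return value.replace("=", "{{=}}")

def incategory_term(names, limit=70):
    """Join up to `limit` page names into one incategory: search term."""
    kept = [n.replace(" ", "_") for n in names[:limit]]  # underscores: an assumption
    return "incategory:" + "|".join(kept)

term = incategory_term(["Operas", "Piano concertos"])
```

Truncating at the limit silently drops any names past the 70th, which matches the behavior implied by note 18: a reader who needs to know whether anything was cut off still has to check the subcategory count separately.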