Text Indexing

Lesson 9
Author : Afrixi
Last Updated : February, 2023
MongoDB - noSQL Database
This course covers the basics of working with MongoDB.

Text indexing is a feature in MongoDB that allows for efficient text search queries. With text indexing, MongoDB can index and search through text fields within a collection. This is especially useful for collections with large amounts of text data, such as blog posts or product descriptions.

To create a text index, call the createIndex() method on the collection and pass a document whose keys are the fields to index and whose values are the string "text". A collection can have at most one text index, but that single index can cover several fields. For example, to create a text index on the title and content fields of a posts collection, you can run the following command:

db.posts.createIndex({ title: "text", content: "text" })

Once the text index is created, you can search for text within the indexed fields by using the $text operator in the find() method. For example, to find all documents in the posts collection that contain the word “MongoDB” in either the title or content fields, you can run the following command:

db.posts.find({ $text: { $search: "MongoDB" } })
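The $search string has a small syntax of its own: space-separated terms are ORed together, a phrase wrapped in double quotes must appear in the document, and a leading hyphen excludes a term. The sketch below assumes a hypothetical helper, buildTextFilter, which is not part of MongoDB and only assembles such a string. It runs the query when the mongosh global db is available and otherwise just prints the filter.

```javascript
// Hypothetical helper (not a MongoDB API): builds a $text filter from
// plain terms, an exact phrase list, and terms to exclude.
function buildTextFilter({ terms = [], phrases = [], exclude = [] }) {
  const parts = [
    ...terms,                        // plain terms are ORed together
    ...phrases.map((p) => `"${p}"`), // a quoted phrase must be present in the document
    ...exclude.map((t) => `-${t}`),  // a leading hyphen negates a term
  ];
  return { $text: { $search: parts.join(" ") } };
}

const filter = buildTextFilter({
  terms: ["MongoDB"],
  phrases: ["text index"],
  exclude: ["deprecated"],
});

// In mongosh the global `db` exists; elsewhere (plain Node.js) only print the filter.
if (typeof db !== "undefined") {
  printjson(db.posts.find(filter).toArray());
} else {
  console.log(JSON.stringify(filter));
}
```

When a phrase is present, only documents containing that phrase are returned, so the plain terms no longer widen the result set.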

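By default, a $text query returns the matching documents in no particular order; the results are not sorted by relevance. To order them by how well they match, project the relevance score with { $meta: "textScore" } and sort on that same expression. You can also specify a projection to return only specific fields from the matched documents.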
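A minimal sketch of returning and ordering by the relevance score, which $text queries expose through { $meta: "textScore" }. The output field name score is an arbitrary choice, and the posts collection with its text index is assumed to exist as created earlier. Outside mongosh the snippet only prints the query parts.

```javascript
// Filter: documents whose indexed fields match the term "MongoDB".
const query = { $text: { $search: "MongoDB" } };

// Projection: return only the title, plus the relevance score as `score`.
const projection = { _id: 0, title: 1, score: { $meta: "textScore" } };

// Sort on the same $meta expression; textScore sorts from highest to lowest.
const order = { score: { $meta: "textScore" } };

// In mongosh the global `db` exists; elsewhere (plain Node.js) only print the parts.
if (typeof db !== "undefined") {
  db.posts.find(query, projection).sort(order).forEach(printjson);
} else {
  console.log(JSON.stringify({ query, projection, order }, null, 2));
}
```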

Text indexing also supports phrase matching, term exclusion, language-specific stemming and stop words, and per-field term weighting. Phrases and exclusions are written in the $search string, while the language and the weights are set as options in the text index definition. Text indexes do not support fuzzy matching: createIndex() has no fuzzy options, and typo-tolerant search is provided by Atlas Search, which is a separate feature. Because a collection can have only one text index, drop the earlier one first, for example with db.posts.dropIndex("title_text_content_text") if it was created under its default name. Then, to weigh matches in title three times as heavily as matches in content, you can define the index as follows:

db.posts.createIndex(
  { title: "text", content: "text" },
  {
    name: "TextIndex",
    default_language: "english",
    language_override: "lang",
    weights: { title: 3, content: 1 }
  }
)

This names the index TextIndex, applies English stemming and stop words by default, lets an individual document choose a different language through its lang field, and gives title a weight of 3 and content a weight of 1 when the relevance score is computed. Settings such as the maximum number of edits, the prefix length, and the maximum number of expansions belong to Atlas Search's fuzzy matching, not to createIndex().
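For typo-tolerant matching with a maximum of 2 edits, a prefix length of 3, and at most 100 expansions, one option is the fuzzy setting of the text operator in the Atlas Search $search aggregation stage. This is a sketch under stated assumptions, not a drop-in replacement for the text index: it assumes the collection is hosted on MongoDB Atlas and already has an Atlas Search index named "default" covering title and content. Outside mongosh it only prints the pipeline.

```javascript
// Assumptions: Atlas-hosted collection with an Atlas Search index named "default"
// that covers the `title` and `content` fields.
const fuzzyPipeline = [
  {
    $search: {
      index: "default",
      text: {
        query: "mongdb",            // misspelled on purpose
        path: ["title", "content"],
        fuzzy: {
          maxEdits: 2,        // allow up to 2 single-character edits per term
          prefixLength: 3,    // the first 3 characters must match exactly
          maxExpansions: 100, // cap on the variations generated per term
        },
      },
    },
  },
  { $limit: 10 },
  // Atlas Search exposes its relevance score as "searchScore".
  { $project: { _id: 0, title: 1, score: { $meta: "searchScore" } } },
];

// In mongosh the global `db` exists; elsewhere (plain Node.js) only print the pipeline.
if (typeof db !== "undefined") {
  db.posts.aggregate(fuzzyPipeline).forEach(printjson);
} else {
  console.log(JSON.stringify(fuzzyPipeline, null, 2));
}
```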