elastalk

Seeding Elasticsearch Indexes

The connect module contains a convenience function called seed that you can use to initialize Elasticsearch indexes.

Directory Structure

When you call the seed function, you only need to provide the path to the directory that contains your seed data. However, the directory must conform to a particular structure.

An example seed data directory structure from the project is shown below.

seed
|-- config.toml
`-- indexes
    |-- cats
    |   |-- cat
    |   |   |-- 5836327c-3592-4fcb-a925-14a106bdcdab
    |   |   `-- 9b31890a-28a1-4f59-a448-1f85dd2435a3
    |   `-- mappings.json
    `-- dogs
        `-- dog
            |-- 564e74ba-1177-4d3c-9160-a08e116ad9ff
            `-- de0a76e7-ecb9-4fac-b524-622ed8c344b8

The Base Directory (“seed”)

This is the base directory that contains all the seed data. If you’re creating your own seed data set, you may give this directory another name.

Indexes

All of the Elasticsearch indexes are defined in a subdirectory called indexes. An Elasticsearch index will be created for each subdirectory and the name of the subdirectory will be the name of the index.

Document Types

Within each index directory there are subdirectories that define document types. The name of each subdirectory will be the name of the document type.

Documents

Within each document type directory are individual files that represent the individual documents that will be indexed. The name of the file will be the id of the document.
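The layout rules above (index directory, then document type directory, then one file per document) can be illustrated with a short Python sketch. It is not part of elastalk; the names SeedDocument and iter_seed_documents are hypothetical, and the sketch only lists what a seed directory contains rather than indexing anything.

```python
from pathlib import Path
from typing import Iterator, NamedTuple


class SeedDocument(NamedTuple):
    index: str      # name of the directory under "indexes"
    doc_type: str   # name of the directory under the index directory
    doc_id: str     # name of the document file
    path: Path      # location of the document file


def iter_seed_documents(root) -> Iterator[SeedDocument]:
    """Yield one SeedDocument per file found under root/indexes."""
    indexes_dir = Path(root) / "indexes"
    for index_dir in sorted(p for p in indexes_dir.iterdir() if p.is_dir()):
        # Files directly inside an index directory (such as mappings.json)
        # are not document types, so only subdirectories are visited.
        for type_dir in sorted(p for p in index_dir.iterdir() if p.is_dir()):
            for doc in sorted(p for p in type_dir.iterdir() if p.is_file()):
                yield SeedDocument(index_dir.name, type_dir.name, doc.name, doc)
```

Run against the example directory shown earlier, this would yield four entries: two for the cats index with document type cat, and two for the dogs index with document type dog.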

Extra Configuration

You can supply additional information about the seed data in an index by placing a config.toml file in the Indexes directory.

Note

The seed function supports a parameter called config, which you can use if you need to name your configuration files something other than “config.toml”.
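A call to seed with a custom configuration file name might look like the sketch below. Several details are assumptions rather than confirmed API: that seed can be imported from elastalk.connect, that it takes the seed directory path as its first argument, and that config is accepted as a keyword argument. The wrapper seed_from_directory is a hypothetical helper that checks for the indexes subdirectory before handing the path to seed.

```python
from pathlib import Path


def seed_from_directory(root, config="config.toml"):
    """Check the layout of a seed directory, then pass it to elastalk's seed."""
    root = Path(root)
    if not (root / "indexes").is_dir():
        raise FileNotFoundError(f"{root} has no 'indexes' subdirectory")
    # Assumption: seed is importable from elastalk.connect, takes the seed
    # directory path as its first argument, and accepts config by keyword.
    from elastalk.connect import seed
    seed(str(root), config=config)


# Example call (hypothetical file name "settings.toml"):
#   seed_from_directory("seed", config="settings.toml")
```

The import is placed inside the function so that the layout check can run, and fail early, even where elastalk is not installed.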

Mappings

If your index has a static mapping, you can include a mappings key in the index configuration file. The value of this key should match what you would provide under mappings if you were creating the index directly.

For example, if you would create the index by submitting the following PUT request to Elasticsearch…

PUT my_index
{
  "mappings": {
    "_doc": {
      "properties": {
        "title":    { "type": "text"  },
        "name":     { "type": "text"  },
        "age":      { "type": "integer" },
        "created":  {
          "type":   "date",
          "format": "strict_date_optional_time||epoch_millis"
        }
      }
    }
  }
}

…your configuration file should include a mappings key that looks like this…

{
  "mappings": {
    "_doc": {
      "properties": {
        "title": {
          "type": "text"
        },
        "name": {
          "type": "text"
        },
        "age": {
          "type": "integer"
        },
        "created": {
          "type": "date",
          "format": "strict_date_optional_time||epoch_millis"
        }
      }
    }
  }
}
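Because the mappings value is the same in both places, it can be copied mechanically from a create-index request body. The following Python sketch shows one way to do that; the helper name mappings_config_text is hypothetical, and where the resulting text is saved is left to the caller.

```python
import json

# The body of the PUT request shown above, as a Python dictionary.
create_index_body = {
    "mappings": {
        "_doc": {
            "properties": {
                "title": {"type": "text"},
                "name": {"type": "text"},
                "age": {"type": "integer"},
                "created": {
                    "type": "date",
                    "format": "strict_date_optional_time||epoch_millis",
                },
            }
        }
    }
}


def mappings_config_text(body: dict) -> str:
    """Return JSON text holding only the "mappings" key of a request body."""
    return json.dumps({"mappings": body["mappings"]}, indent=2)


config_text = mappings_config_text(create_index_body)
```

Any other keys in the request body, such as index settings, are dropped, so the output contains only the mappings.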