Upgrade an Apache Solr Search index from 1.4 to 3.6 (and later versions)

Recently I had to upgrade someone's Apache Solr installation from 1.4 to 5.x (the current latest version), and for the most part, a Solr upgrade is straightforward, especially if you're doing it for a Drupal site that uses the Search API or Solr Search modules, as the solr configuration files are already upgraded for you (you just need to switch them out when you do the upgrade, making any necessary customizations).

However, I ran into the following error when I tried loading the core running Apache Solr 4.x or 5.x:

org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource: MMapIndexInput(path="/var/solr/cores/[corename]/data/spellchecker2/_1m.cfx") [slice=_1m.fdx]): 1 (needs to be between 2 and 3). This version of Lucene only supports indexes created with release 3.0 and later.

To fix this, you need to upgrade your index using Solr 3.5.0 or later, then you can upgrade to 4.x, then 5.x (using each version of Solr to upgrade from the previous major version):

  1. Run locate lucene-core to find your Solr installation's lucene-core.jar file. In my case, for 3.6.2, it was named lucene-core-3.6.2.jar.
  2. Find the full directory path to the Solr core's data/index.
  3. Stop Solr (so the index isn't being actively written to.
  4. Run the command to upgrade the index: java -cp /full/path/to/lucene-core-3.6.2.jar org.apache.lucene.index.IndexUpgrader -delete-prior-commits -verbose /full/path/to/data/index

It will take a few seconds for a small index (hundreds of records), or a bit longer for a huge index (hundreds of thousands of records), and then once it's finished, you should be able to start Solr again using the upgraded index. Rinse and repeat for each version of Solr you need to upgrade through.

If you have directories like index, spellchecker, spellchecker1, and spellchecker2 inside your data directory, run the command over each subdirectory to make sure all indexes are updated.

For more info, see the IndexUpgrader documentation, and the Stack Overflow answer that instigated this post.