Case folding is what allows searches for pais or paîs to match país. By default, Sphinx won’t even index accented characters. It treats them as word breaks, so you would have to search for pa s to match país. I would have thought a standard Latin case folding config would be a given for just about everyone using Sphinx, but a cursory Googling didn’t turn up much. The best article I found was by James Healy.
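To make the word-break behaviour concrete, here is a rough Ruby simulation. It is not how Sphinx is implemented: the ASCII-only default set and the tiny folding map are simplifying assumptions, standing in for charset_table rules of the form U+00ED->i.

```ruby
# Rough simulation only: Sphinx's real tokenizer is driven by charset_table.
# Assumed default set for this sketch: ASCII letters, digits and underscore.
SEPARATORS = /[^a-z0-9_]+/

# Hypothetical folding map standing in for a few charset_table rules.
FOLDING = { "í" => "i", "î" => "i", "Í" => "i", "Î" => "i" }.freeze

# Without folding, anything outside the default set acts as a word break.
def tokenize_default(text)
  text.downcase.split(SEPARATORS).reject(&:empty?)
end

# With folding, accented characters are mapped to base letters first.
def tokenize_folded(text)
  tokenize_default(text.gsub(/[íîÍÎ]/, FOLDING))
end

p tokenize_default("país")  # => ["pa", "s"]
p tokenize_folded("país")   # => ["pais"]
p tokenize_folded("paîs")   # => ["pais"]
```

With the folding applied to both the indexed text and the query, pais, paîs and país all end up as the same token.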
Healy’s config works, but then in the Sphinx wiki I found a reference to a formatted list whose layout and length imparted a certain air of authority.
I plugged it into the config given by Mr. Healy and ran into YAML-related problems: a double-quoted string in YAML will generally collapse everything onto one line. After I removed the comments from the list, the generated config looked good, but then Sphinx choked on the incredibly long line:
ERROR: line too long in /Users/dasil003/myapp/config/development.sphinx.conf line 39 col 1.
FATAL: failed to parse config file '/Users/dasil003/myapp/config/development.sphinx.conf'.
The subtleties of YAML parsing are something I’d rather not commit to memory. Instead we use experimental science. What I needed was lines formatted like this:
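As one illustration of the kind of shape meant, assuming the goal is Sphinx's backslash line continuation (each physical line short, each ending in a comma and a backslash), a small Ruby helper can wrap a long comma-separated value. The helper name wrap_charset, the sample mappings and the 40-character width are all made up for this sketch.

```ruby
# Wrap a comma-separated charset_table value into short physical lines.
# Every line except the last ends in ", \" so the config parser can treat
# the whole thing as one logical line (assumed: backslash continuation).
def wrap_charset(entries, width = 40)
  lines = [""]
  entries.each do |entry|
    candidate = lines.last.empty? ? entry : "#{lines.last}, #{entry}"
    if candidate.length > width && !lines.last.empty?
      lines << entry          # start a new physical line
    else
      lines[-1] = candidate   # keep filling the current line
    end
  end
  lines.join(", \\\n")
end

# Illustrative entries only, not the full list from the Sphinx wiki.
entries = %w[0..9 A..Z->a..z _ a..z U+00C0->a U+00C1->a U+00E0->a U+00E1->a]
puts "charset_table = " + wrap_charset(entries)
```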
Thinking Sphinx Case Folding Configuration
IRb to the rescue:
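A sketch of the kind of experiment that is easy to run in IRb, not the original session: load the same multi-line value written two ways and compare what comes back. The key names and the short sample table are assumptions for illustration.

```ruby
require "yaml"

# The same value written as a double-quoted scalar and as a literal block.
doc = [
  'double_quoted: "0..9, A..Z->a..z,',
  '  a..z"',
  'literal_block: |',
  '  0..9, A..Z->a..z, \\',
  '  a..z',
].join("\n") + "\n"

config = YAML.load(doc)

# Double-quoted scalars fold each line break into a single space.
p config["double_quoted"]   # => "0..9, A..Z->a..z, a..z"

# A literal block (|) keeps line breaks and backslashes exactly as written.
p config["literal_block"]   # => "0..9, A..Z->a..z, \\\na..z\n"
```

The literal block form is what keeps the line breaks, and with them the short lines, intact on the way through YAML.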
Well shucks, that ain’t so bad.
The final product, which you can drop directly into your sphinx.yml file (for the development environment anyway):