We recently had to fix a bug where our context indexes were not indexing the contents of any Office Open XML files that we uploaded to our database. The SQL that we ended up with was something akin to this:
BEGIN
CTX_DDL.CREATE_PREFERENCE('"CX_OBJECT_DST"', 'MULTI_COLUMN_DATASTORE');
-- DESCRIPTION and FILENAME are both VARCHAR2, OBJECT is a BLOB
CTX_DDL.SET_ATTRIBUTE('"CX_OBJECT_DST"', 'COLUMNS', 'DESCRIPTION,FILENAME,OBJECT');
CTX_DDL.SET_ATTRIBUTE('"CX_OBJECT_DST"', 'FILTER', 'N,N,Y');
END;
/
DROP INDEX SCHEMA_NAME.ATTACHMENT_OBJECT_IDX;
CREATE INDEX SCHEMA_NAME.ATTACHMENT_OBJECT_IDX
ON SCHEMA_NAME.ATTACHMENT (OBJECT)
INDEXTYPE IS CTXSYS.CONTEXT
PARAMETERS('datastore CX_OBJECT_DST')
NOPARALLEL;
ALTER INDEX SCHEMA_NAME.ATTACHMENT_OBJECT_IDX REBUILD;
This is on an Oracle 11.2.0.4 database.
At first glance, rebuilding an index immediately after it's created seems counterintuitive. But we found that if we omitted the REBUILD, the index didn't pick up the contents of any attachments that we uploaded.
I don't understand why this would be the case (although I will be the first to admit that my knowledge in this area isn't great). What does REBUILD do that CREATE doesn't that causes this to work?
Whenever someone asks why we're doing the rebuild immediately after creation, all we can respond with at the moment is "because it doesn't work if we don't", which isn't a very satisfactory answer to hear (or to give for that matter)...
We have a background job that runs once per minute and calls a stored procedure, which in turn calls:
CTX_DDL.SYNC_INDEX('ATTACHMENT_OBJECT_IDX');
The procedure itself just includes some exception handling code and this one call - nothing that should impact on this.
We took the job offline while the index was dropped and recreated, then brought it back online after it was finished. We then left that job running for a few minutes to ensure that it wasn't failing (which it wasn't), then uploaded our .docx file to the database. We again waited until the job had run and verified that it didn't fail (again, it was fine), then attempted to search for the contents of that uploaded file, which always returned no results.
If we then do a REBUILD on that index, the file is indexed and all new files from then on are also indexed properly. If we don't, it never seems to work (NB: We have also tried leaving the job online while the index was dropped and recreated, but didn't expect that to work - and it didn't).
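For anyone trying to reproduce this, here is a minimal sketch of queries that could be run between the upload and the search. They show whether the new row was queued for the sync job and whether Oracle Text logged an error for it. The sketch assumes the queries run as the index owner (SCHEMA_NAME above) and uses the standard CTX_USER_PENDING and CTX_USER_INDEX_ERRORS views. It is a diagnostic aid, not an explanation of the behaviour.

```sql
-- Rows still waiting to be processed by the next CTX_DDL.SYNC_INDEX call.
-- An uploaded attachment should appear here until the sync job has run.
SELECT pnd_index_name, pnd_rowid, pnd_timestamp
  FROM ctx_user_pending
 WHERE pnd_index_name = 'ATTACHMENT_OBJECT_IDX';

-- Per-document errors recorded during index creation or sync
-- (for example, a failure while filtering a particular row).
SELECT err_timestamp, err_textkey, err_text
  FROM ctx_user_index_errors
 WHERE err_index_name = 'ATTACHMENT_OBJECT_IDX'
 ORDER BY err_timestamp DESC;
```

If the row leaves the pending queue after a sync and no error is logged, yet the search still returns nothing, the document was most likely processed but produced no indexable tokens.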