Home » Server Options » Text & interMedia » Abbreviations and acronyms thesaurus (11gR2)
Abbreviations and acronyms thesaurus [message #634899] Tue, 17 March 2015 17:34 Go to next message
jazztunes
Messages: 1
Registered: March 2015
Location: United States
Junior Member
Hi!

I'm trying to do searches on text that contains state names. I want users to be able to use state abbreviations and still find what they're looking for. It seems like a good use for a thesaurus. It works for most states, but it doesn't seem to work for states that have a space such as "new york".

Any thoughts or alternative approaches would be appreciated!

--DH
-------------------------------------------------------------------
--Create a table with some data containing state names
-------------------------------------------------------------------
drop TABLE test_tab;
CREATE TABLE test_tab (state VARCHAR2(60) primary key);

insert into test_tab VALUES ('deep in the heart of texas');
insert into test_tab VALUES ('I heart new york');
commit;

-------------------------------------------------------------------
--Create a THESAURUS that relates the state to it's abbreviation 
-------------------------------------------------------------------
EXEC CTX_THES.drop_THESAURUS ('STATE_ABBR');
begin
  CTX_THES.CREATE_THESAURUS ('STATE_ABBR');
  CTX_THES.CREATE_RELATION ('STATE_ABBR', 'TX', 'SYN', 'TEXAS');
  CTX_THES.CREATE_RELATION ('STATE_ABBR', 'NY', 'SYN', 'NEW YORK');
end;
/

-------------------------------------------------------------------
--Create a datastore
-------------------------------------------------------------------
exec ctx_ddl.drop_preference('state_ds');
begin
  ctx_ddl.create_preference('state_ds','MULTI_COLUMN_DATASTORE'); 
  ctx_ddl.set_attribute('state_ds', 'COLUMNS', 'state'); 
end;
/

-------------------------------------------------------------------
--Create a name group section
-------------------------------------------------------------------
exec ctx_ddl.drop_section_group('state_ng');
begin
  ctx_ddl.create_section_group('state_ng', 'BASIC_SECTION_GROUP');
  ctx_ddl.add_ndata_section('state_ng', 'state', 'state');
end;
/

-------------------------------------------------------------------
--Create the word list
-------------------------------------------------------------------
exec ctx_ddl.drop_preference('state_wl');
begin
  ctx_ddl.create_preference('state_wl', 'BASIC_WORDLIST');
  ctx_ddl.set_attribute('state_wl', 'NDATA_THESAURUS', 'STATE_ABBR');
end;
/

-------------------------------------------------------------------
--Create the index
-------------------------------------------------------------------
drop INDEX test_idx;

CREATE INDEX test_idx on test_tab (state)
  INDEXTYPE IS CTXSYS.CONTEXT
  parameters('datastore state_ds wordlist state_wl section group state_ng')
;

-------------------------------------------------------------------
--Run some tests
-------------------------------------------------------------------

--This correctly finds "texas"
select score(1) SCORE, state from test_tab
where contains(state,'ndata(state,tx)',1)>0 order by score(1) desc;

--This should find "new york", but it doesn't!
select score(1) SCORE, state from test_tab
where contains(state,'ndata(state,ny)',1)>0 order by score(1) desc;

[Updated on: Tue, 17 March 2015 20:00]

Report message to a moderator

Re: Abbreviations and acronyms thesaurus [message #634908 is a reply to message #634899] Tue, 17 March 2015 20:36 Go to previous message
Barbara Boehmer
Messages: 9077
Registered: November 2002
Location: California, USA
Senior Member
I have tested and tried a few variations, like adding an empty stoplist, and can't get it to work with a state with a space in it either. I also can't find anything in the documentation that says that synonyms in an ndata_thesaurus cannot have spaces. However, I don't see any examples with spaces either. This may be an undocumented restriction or a bug. I suggest that you re-post the question on the OTN Text sub-forum/space where it is likely to be seen by the Oracle Text product manager:

https://community.oracle.com/community/database/text/content?customTheme=otn
Previous Topic: Accent insensitive indexation
Next Topic: Find subset of string
Goto Forum:
  


Current Time: Tue Mar 19 00:16:44 CDT 2024