performance - Oracle perfromance joining 3 tables with sum aggregate function -
i have 3 tables terms ( id, isn,sentenceid,term_root,sentence_length) contains full corpus ( more 20 million records ) user_terms ( id,isn,sentenceid,term_roott,sentence_length) contains user document info ( 100000 records) correlations ( id, term1, term2, correlation_factor ) contains static data, describes correlation , similarity between 2 terms (contains 500000 records)
i want find similarity between user document sentence , documents in corpus. joining user term , corpus term , find correlation factor, , summing result each sentence used query
select tt.sentenceid ,tu.sentenceid, sum (c.correlation_factor)/greatest( tu.sentence_length,tt.sentence_length), tt.isn , tu.isn correlations c, terms tt, user_terms tu (tt.term_root = c.term1 , tu.term_root = c.term2 , tu.isn='22242') group tt.sentence_id, tu.sentence_id, tt.isn, tu.isn,tu.sentence_length,tt.sentence_length having sum (c.correlation_factor)/greatest( tu.sentence_length,tt.sentence_length)>0.6;
this query takes more 10 min. how can rewrite fast possible? indexes need?
edited : explain plan: | 0 | select statement|| 2670 | 219k| 3864 (1)| 00:00:47 |
|* 1 | filter || | | | |
| 2 | hash group || 2670 | 219k| 3864 (1)| 00:00:47 |
| 3 | nested loops || 53383 | 4379k| 3862 (1)| 00:00:47 |
|* 4 | hash join || 1414 | 86254 | 1032 (1)| 00:00:13 |
|* 5 | index range scan | index5| 1425 | 31350 | 8 (0)| 00:00:01 |
| 6 | index fast full scan| plag_term_correlations4_uk1 | 563k| 20m| 1020 (1)| 00:00:13 |
|* 7 | index range scan | plag_terms_index1 | 38 | 874 | 2 (0)| 00:00:01 |
predicate information (identified operation id):
1 - filter(sum("c"."correlation_factor")/greatest("tu"."sentence_length","tt"."sentence_lengt h")>0.6)
4 - access("tu"."term_root"="c"."term2")
5 - access("tu"."isn"='22242')
7 - access("tt"."term_root"="c"."term1")
Comments
Post a Comment