mysql - Very large table JOIN with GROUP BY -
i need combine information 27 millions lines table 7 millions lines table , filtering.
create table event_participation ( place_id int(4), person_id varchar(12), event_id varchar(10), event_description varchar(230), .... , more fields specific participation ) engine=innodb default charset=utf8; create index idx_1 on event_participation (place_id); create index idx_2 on event_participation (person_id); create index idx_3 on event_participation (event_id); create table person ( person_id varchar(12), last_name varchar(25), first_name varchar(20), middle_name varchar(20), person_attr1 varchar(20), ... person_attr50 varchar(20), ) engine=innodb default charset=utf8; create index idx_10 on person (person_id); create index idx_11 on person (person_attr1); create index idx_12 on person (person_attr2); ...
i have indexes on attributes used in queries.
there 27 millions lines in event_participation table , 7 millions lines in table person.
i need run queries this:
select person.last_name, person.first_name event_participation left join person on event_participation.person_id = person.person_id event_id in ("event 1", "event 2", "event 3", "event 4", "event 5", "event 6", "event 7") , person.person_attr1 = 'a' , person.person_attr2 = 'b' , place_id = 90 group event_participation.person_id having count(event_id) >= 3
explain is:
*************************** 1. row *************************** id: 1 select_type: simple table: event_participation type: ref possible_keys: person_id,event_id,place_id key: place_id key_len: 5 ref: const rows: 6437170 extra: using where; using temporary; using filesort *************************** 2. row *************************** id: 1 select_type: simple table: person type: ref possible_keys: person_id,person_attr1,person_attr2 key: person_id key_len: 39 ref: event_participation.person_id rows: 1 extra: using
i looking active persons participate @ least 3 events list , fulfill other criteria. fix criteria related events , run multiple queries varying person attributes.
even small subset query slow looked alternative approaches. create cache table:
create table temp_name ( person_id varchar(12), primary key (person_id) ) engine=innodb default charset=utf8;
run query like:
insert temp_name ( temp_name ) select distinct event_participation.person_id event_participation event_id in ("event 1", "event 2", "event 3", "event 4", "event 5", "event 6", "event 7") , place_id = 90 group event_participation.person_id having count(event_id) >= 3
then run filter queries like:
select person.last_name, person.first_name temp_name left join person on temp_name.person_id = person.person_id person.person_attr1 = 'a' , person.person_attr2 = 'b'
while can live current performance final queries, creation , management of temporary tables killing me. suggestions highly appreciated.
could try out:
select person.last_name, person.first_name person pers inner join (select person_id, count(*) count event_participation event_id in ("event 1", "event 2", "event 3", "event 4", "event 5", "event 6", "event 7") , person.person_attr1 = 'a' , person.person_attr1 = 'b' , place_id = 90 group person_id ) event_count on event_count.person_id = pers.person_id , event_count.count>2
update: forgot ad group inner query.
Comments
Post a Comment