mysql - Very large table JOIN with GROUP BY -


i need combine information 27 millions lines table 7 millions lines table , filtering.

create table event_participation (     place_id                    int(4),     person_id                   varchar(12),     event_id                    varchar(10),     event_description           varchar(230),       .... , more fields specific participation  ) engine=innodb default charset=utf8; create index idx_1 on event_participation (place_id); create index idx_2 on event_participation (person_id); create index idx_3 on event_participation (event_id);  create table person (     person_id                   varchar(12),     last_name                   varchar(25),     first_name                  varchar(20),     middle_name                 varchar(20),      person_attr1                varchar(20),     ...     person_attr50              varchar(20),  ) engine=innodb default charset=utf8; create index idx_10 on person (person_id); create index idx_11 on person (person_attr1); create index idx_12 on person (person_attr2); ... 

i have indexes on attributes used in queries.

there 27 millions lines in event_participation table , 7 millions lines in table person.

i need run queries this:

select   person.last_name, person.first_name     event_participation      left join person on event_participation.person_id = person.person_id    event_id in ("event 1", "event 2", "event 3", "event 4",          "event 5", "event 6", "event 7") , person.person_attr1 = 'a' ,          person.person_attr2 = 'b' , place_id = 90 group event_participation.person_id having   count(event_id) >= 3 

explain is:

*************************** 1. row ***************************            id: 1   select_type: simple         table: event_participation          type: ref possible_keys: person_id,event_id,place_id           key: place_id       key_len: 5           ref: const          rows: 6437170         extra: using where; using temporary; using filesort *************************** 2. row ***************************            id: 1   select_type: simple         table: person          type: ref possible_keys: person_id,person_attr1,person_attr2            key: person_id       key_len: 39           ref: event_participation.person_id          rows: 1         extra: using 

i looking active persons participate @ least 3 events list , fulfill other criteria. fix criteria related events , run multiple queries varying person attributes.

even small subset query slow looked alternative approaches. create cache table:

create table temp_name (     person_id                   varchar(12),     primary key (person_id) ) engine=innodb default charset=utf8; 

run query like:

insert temp_name (  temp_name )      select   distinct event_participation.person_id         event_participation        event_id in ("event 1", "event 2", "event 3", "event 4",              "event 5", "event 6", "event 7") , place_id = 90     group event_participation.person_id     having   count(event_id) >= 3 

then run filter queries like:

select person.last_name, person.first_name temp_name left join person on temp_name.person_id = person.person_id person.person_attr1 = 'a' , person.person_attr2 = 'b' 

while can live current performance final queries, creation , management of temporary tables killing me. suggestions highly appreciated.

could try out:

select   person.last_name, person.first_name person pers inner join (select person_id, count(*) count     event_participation     event_id in ("event 1", "event 2", "event 3", "event 4",          "event 5", "event 6", "event 7") , person.person_attr1 = 'a' ,          person.person_attr1 = 'b' , place_id = 90 group person_id ) event_count on event_count.person_id = pers.person_id , event_count.count>2 

update: forgot ad group inner query.


Comments

Popular posts from this blog

jquery - How do you format the date used in the popover widget title of FullCalendar? -

asp.net mvc - SSO between MVCForum and Umbraco7 -

Python Tkinter keyboard using bind -