java - How to speed up Hibernate batch processing and avoid OutOfMemoryException -


description

i have spring application using hibernate orm communicate database. have function populates database tables named orders, order_line , cc_xacts. relations between tables shown below:

--------            -------------- |orders|--- 1:m --->| order_line | --------            --------------    |                ------------    -------- 1:m --->| cc_xacts |                     ------------ 

so orders have one-to-many relationship cc_xacts , order_line entity.

public void populateordersandcc_xactstable() {     gregoriancalendar cal;     string[] credit_cards = {"visa", "mastercard", "discover", "amex",             "diners" };     int num_card_types = 5;     string[] ship_types = {"air", "ups", "fedex", "ship", "courier", "mail" };     int num_ship_types = 6;      string[] status_types = {"processing", "shipped", "pending", "denied" };     int num_status_types = 4;      // order variables     int o_c_id;     java.sql.timestamp o_date;     double o_sub_total;     double o_tax;     double o_total;     string o_ship_type;     java.sql.timestamp o_ship_date;     int o_bill_addr_id, o_ship_addr_id;     string o_status;      string cx_type;     int cx_num;     string cx_name;     java.sql.date cx_expiry;     string cx_auth_id;     int cx_co_id;      system.out.println( "populating orders, order_lines, cc_xacts "             + num_orders + " orders" );      system.out.print( "complete (in 10,000's): " );       ( int = 1; <= num_orders; i++ )     {         if ( % 10000 == 0 )             system.out.print( / 10000 + " " );          int num_items = getrandomint( 1, 5 );         o_c_id = getrandomint( 1, num_customers );         cal = new gregoriancalendar();         cal.add( calendar.day_of_year, -1 * getrandomint( 1, 60 ) );         o_date = new java.sql.timestamp( cal.gettime().gettime() );         o_sub_total = (double) getrandomint( 1000, 999999 ) / 100;         o_tax = o_sub_total * 0.0825;         o_total = o_sub_total + o_tax + 3.00 + num_items;         o_ship_type = ship_types[getrandomint( 0, num_ship_types - 1 )];         cal.add( calendar.day_of_year, getrandomint( 0, 7 ) );         o_ship_date = new java.sql.timestamp( cal.gettime().gettime() );          o_bill_addr_id = getrandomint( 1, 2 * num_customers );         o_ship_addr_id = getrandomint( 1, 2 * num_customers );         o_status = status_types[getrandomint( 0, num_status_types - 1 )];          orders order = new orders();          // set parameter         order.setcustomer( customerdao.findbyid( o_c_id ) );         order.setodate( new date( o_date.gettime() ) );         order.setosubtotal( o_sub_total );         order.setotax( o_tax );         order.setototal( o_total );         order.setoshiptype( o_ship_type );         order.setoshipdate( o_ship_date );         order.setaddressbyobilladdrid( addressdao.findbyid( o_bill_addr_id ) );         order.setaddressbyoshipaddrid( addressdao.findbyid( o_ship_addr_id ) );         order.setostatus( o_status );         order.setccxactses( new hashset<iccxacts>() );         order.setorderlines( new hashset<iorderline>() );         ordersdao.shrani( order );          ( int j = 1; j <= num_items; j++ )         {             int ol_id = j;             int ol_o_id = i;             int ol_i_id = getrandomint( 1, num_items );             int ol_qty = getrandomint( 1, 300 );             double ol_discount = (double) getrandomint( 0, 30 ) / 100;             string ol_comments = getrandomastring( 20, 100 );              orderline orderline = new orderline();             orderline.setitem( itemdao.findbyid( ol_i_id ) );             orderline.setolqty( ol_qty );             orderline.setoldiscount( ol_discount );             orderline.setolcomment( ol_comments );             orderline.setorders( order );              orderlinedao.shrani( orderline );             order.getorderlines().add( orderline );          }          cx_type = credit_cards[getrandomint( 0, num_card_types - 1 )];         cx_num = getrandomnstring( 16 );         cx_name = getrandomastring( 14, 30 );         cal = new gregoriancalendar();         cal.add( calendar.day_of_year, getrandomint( 10, 730 ) );         cx_expiry = new java.sql.date( cal.gettime().gettime() );         cx_auth_id = getrandomastring( 15 );         cx_co_id = getrandomint( 1, 92 );          ccxacts ccxacts = new ccxacts();          ccxacts.setcountry( countrydao.findbyid( cx_co_id ) );         ccxacts.setorders( order );         ccxacts.setcxtype( cx_type );         ccxacts.setcxnum( cx_num );         ccxacts.setcxname( cx_name );         ccxacts.setcxexpiry( cx_expiry );         ccxacts.setcxauthid( cx_auth_id );         ccxacts.setcxxactamt( o_total );         ccxacts.setcxxactdate( o_ship_date );          ccxacts.setcountry( countrydao.findbyid( cx_co_id ) );          order.getccxactses().add( ccxacts );          ccxactsdao.shrani( ccxacts );      }      system.out.println( "" ); } 

problem

the problem occurs when want populate around 250000 orders in orders entity, because doing i'm doing slow. takes lot of hours populate database many orders. run out of java heap size , outofmemoryexception.

do have suggestions how can speed , possibly not run out of java heap size?

  1. you need activate hibernate batch support, need set following hibernate properties:

    properties.put("hibernate.jdbc.batch_size", "50"); properties.put("hibernate.order_inserts", "true"); properties.put("hibernate.order_updates", "true"); properties.put("hibernate.jdbc.batch_versioned_data", "true"); 
  2. to avoid outofmemoryerror need clear current session whenever batch ready flushed:

    dointransaction(session -> {     int batchsize = batchsize();     for(int = 0; < itemscount(); i++) {          //batch insert logic          if(i % batchsize == 0 && > 0) {             session.flush();             session.clear();         }     } }); 
  3. try maintain map of entities don't have fetch these entities in every batch step:

    customerdao.findbyid( o_c_id ); countrydao.findbyid( cx_co_id ); 

    you can either fetch them @ beginning of batch process or @ least cache result in local map. each select triggers current session flush (if use auto flushing), therefore decreasing batch performance.


Comments

Popular posts from this blog

asp.net mvc - SSO between MVCForum and Umbraco7 -

Python Tkinter keyboard using bind -

ubuntu - Selenium Node Not Connecting to Hub, Not Opening Port -