An Introduction to JPA Performance Tuning

(Based mainly on the Hibernate and EclipseLink documentation.)

Fetching strategies

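Should an entity's associations be fetched lazily or eagerly? Each use case needs an appropriate fetch strategy: eager fetching where it is not needed wastes memory, while lazy fetching where the data is always needed increases the number of round trips to the database.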

The Hibernate recommendation is to statically mark all associations lazy and to use dynamic fetching strategies for eagerness. This is unfortunately at odds with the JPA specification which defines that all one-to-one and many-to-one associations should be eagerly fetched by default. Hibernate, as a JPA provider, honors that default.

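Hibernate recommends statically declaring associations as lazy (for example, @ManyToOne(fetch = FetchType.LAZY)) and then specifying the fetch strategy dynamically wherever eager fetching is needed.

For example, a fetch join in JPQL or the Criteria API fetches an association eagerly: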

SELECT mag FROM Magazine mag LEFT JOIN FETCH mag.articles WHERE mag.id = 1
-- Add DISTINCT if duplicate results are not wanted

builder.createQuery(Employee.class).from(Employee.class).fetch("projects", JoinType.LEFT)...

Entity graphs (@NamedEntityGraph) or Hibernate fetch profiles (@FetchProfile) can define fetch strategies for different use cases.

In addition, with lazy fetching, specifying a batch size reduces the number of fetches.
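
To make the trade-off concrete, the sketch below counts database round trips when N parent entities are loaded and one association is touched on each. It is an idealized model rather than Hibernate code: the class FetchRoundTrips and its methods are hypothetical, and real batch fetching may issue somewhat more queries than the ceiling used here.

```java
/** Idealized round-trip counts for loading `parents` entities plus one association each. */
final class FetchRoundTrips {
    private FetchRoundTrips() {}

    /** Lazy loading without batching: one query for the parents, then one per parent (N+1). */
    static long lazy(long parents) {
        return 1 + parents;
    }

    /** Lazy loading with a batch size: associations are loaded in groups of up to batchSize. */
    static long lazyBatched(long parents, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        return 1 + (parents + batchSize - 1) / batchSize; // ceiling division
    }

    /** Fetch join or entity graph: parents and association arrive in a single query. */
    static long fetchJoin(long parents) {
        return 1;
    }

    public static void main(String[] args) {
        long n = 1_000;
        System.out.println("lazy:        " + lazy(n));
        System.out.println("lazy, batch: " + lazyBatched(n, 25));
        System.out.println("fetch join:  " + fetchJoin(n));
    }
}
```

The single query of a fetch join is not free: it returns a wider, possibly duplicated result set, which is why the choice depends on the use case.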

Pagination

If the result set is too large, consider returning it page by page using setFirstResult() and setMaxResults().
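
As a minimal sketch of the arithmetic involved, the helper below converts a zero-based page index into the two values passed to setFirstResult() and setMaxResults(). PageWindow is a hypothetical name, not part of JPA, and the JPA call itself appears only in a comment because it needs a provider and a database to run.

```java
/** Hypothetical helper: turns a zero-based page index into JPA paging arguments. */
final class PageWindow {
    final int firstResult; // value for Query.setFirstResult(): number of rows to skip
    final int maxResults;  // value for Query.setMaxResults(): number of rows to return

    PageWindow(int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize < 1) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize >= 1");
        }
        // multiplyExact throws ArithmeticException instead of silently overflowing
        this.firstResult = Math.multiplyExact(pageIndex, pageSize);
        this.maxResults = pageSize;
    }

    // Intended use with a JPA query (not executed here):
    //   query.setFirstResult(window.firstResult)
    //        .setMaxResults(window.maxResults)
    //        .getResultList();

    public static void main(String[] args) {
        PageWindow window = new PageWindow(3, 20);
        System.out.println(window.firstResult + " " + window.maxResults);
    }
}
```

Paging is only stable if the query has a deterministic ORDER BY; otherwise the same row may show up on two pages.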

Session batching

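Persisting a large number of entities in one go has several drawbacks. First, the objects consume a lot of memory (including growth of the L1 cache). Second, the commit holds a database connection for a long time.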

txn.begin();

for ( int i = 0; i < 100_000; i++ ) {
	Person person = new Person( String.format( "Person %d", i ) );
	entityManager.persist( person );
}

txn.commit();

Calling flush() and clear() periodically keeps the size of the L1 cache, and therefore memory consumption, under control.

int batchSize = 25;

for ( int i = 0; i < entityCount; ++i ) {
	Person person = new Person( String.format( "Person %d", i ) );
	entityManager.persist( person );

	if ( i > 0 && i % batchSize == 0 ) {
		//flush a batch of inserts and release memory
		entityManager.flush();
		entityManager.clear();
	}
}

JDBC batching

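JPA providers generally expose properties for tuning JDBC. In Hibernate, hibernate.jdbc.batch_size sets the JDBC batch size, so that several statements are sent to the database together; hibernate.jdbc.fetch_size is a separate setting that controls how many rows the driver fetches at a time when reading results. JDBC batching reduces the number of round trips to the database.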

To enable JDBC batching, set the hibernate.jdbc.batch_size property to an integer between 10 and 50.
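
One way to supply these settings is as a property map handed to Persistence.createEntityManagerFactory(String, Map). The sketch below only builds the map, so it runs without a JPA provider. The class name BatchingProperties is hypothetical, and hibernate.order_inserts and hibernate.order_updates are additions not mentioned above: they ask Hibernate to group statements by entity so that batches fill up more often.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that collects Hibernate's JDBC batching settings. */
final class BatchingProperties {
    private BatchingProperties() {}

    static Map<String, String> forBatchSize(int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        Map<String, String> props = new LinkedHashMap<>();
        // Number of statements Hibernate groups into one JDBC batch
        props.put("hibernate.jdbc.batch_size", Integer.toString(batchSize));
        // Assumption: statement ordering is also wanted so batches are not split
        props.put("hibernate.order_inserts", "true");
        props.put("hibernate.order_updates", "true");
        return props;
    }

    public static void main(String[] args) {
        // With a provider on the classpath, the map would be passed as:
        //   Persistence.createEntityManagerFactory("my-unit", props);
        // where "my-unit" is a placeholder persistence unit name.
        forBatchSize(25).forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

The same keys can instead be declared as property elements in persistence.xml.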

JPQL Constructor Expressions

JPQL constructor expressions can query a subset of columns, so that not every column is mapped into in-memory objects.

SELECT NEW com.company.PublisherInfo(pub.id, pub.revenue, mag.price)
    FROM Publisher pub JOIN pub.magazines mag WHERE mag.price > 5.00

They can construct either entity objects (which will be in the "new" state) or DTO objects.

If an entity class name is specified in the SELECT NEW clause, the resulting entity instances are in the new state.
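
For the query above to work, PublisherInfo needs a constructor whose parameters match the selected expressions in number, order and type. A minimal sketch of such a DTO follows. The field types (long and double) are assumptions, since the entity mappings are not shown; if the entity used BigDecimal for money, for instance, the constructor would have to accept BigDecimal. In a real project the class would also be public and declared in package com.company to match the query; both are left out here to keep the sketch standalone.

```java
/** Plain DTO targeted by: SELECT NEW com.company.PublisherInfo(pub.id, pub.revenue, mag.price) */
final class PublisherInfo {
    private final long id;        // assumed type of pub.id
    private final double revenue; // assumed type of pub.revenue
    private final double price;   // assumed type of mag.price

    /** Parameter order must follow the order of expressions in the SELECT NEW clause. */
    public PublisherInfo(long id, double revenue, double price) {
        this.id = id;
        this.revenue = revenue;
        this.price = price;
    }

    public long getId() { return id; }
    public double getRevenue() { return revenue; }
    public double getPrice() { return price; }

    public static void main(String[] args) {
        PublisherInfo info = new PublisherInfo(1L, 1_000_000.0, 5.99);
        System.out.println(info.getId() + " " + info.getRevenue() + " " + info.getPrice());
    }
}
```

Because the DTO is not an entity, instances are never managed by the persistence context, which is part of what makes this approach cheap for read-only views.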

Read-Only Objects

In EclipseLink, declaring a class read-only avoids unnecessary overhead.

myUnitofWork.addReadOnlyClass(B.class);

Sequence Allocation Size

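Avoid @GeneratedValue(strategy=GenerationType.IDENTITY) for primary keys where possible. With IDENTITY, Hibernate disables JDBC batching of inserts for that entity.

When using a sequence, choose a suitable allocationSize. With allocationSize=1, every persist of a new entity asks the database for the sequence's next value.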

@SequenceGenerator(name="seq", initialValue=1, allocationSize=50)
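
The effect of allocationSize can be illustrated with a simplified block allocator. This is a sketch of the idea only, not Hibernate's or EclipseLink's actual optimizer: BlockIdAllocator and CountingSequence are hypothetical classes, and CountingSequence stands in for a database sequence whose increment equals the allocation size, counting one round trip per call.

```java
import java.util.function.LongSupplier;

/** Stand-in for a database sequence; each call models one round trip. */
final class CountingSequence implements LongSupplier {
    private final int incrementBy;
    private long current;
    long calls; // number of simulated round trips so far

    CountingSequence(long initialValue, int incrementBy) {
        this.current = initialValue;
        this.incrementBy = incrementBy;
    }

    @Override
    public long getAsLong() {
        calls++;
        long value = current;
        current += incrementBy;
        return value;
    }
}

/** Hands out ids from an in-memory block and only hits the sequence when the block is used up. */
final class BlockIdAllocator {
    private final LongSupplier databaseSequence;
    private final int allocationSize;
    private long next;
    private long remaining; // ids left in the current block

    BlockIdAllocator(LongSupplier databaseSequence, int allocationSize) {
        if (allocationSize < 1) {
            throw new IllegalArgumentException("allocationSize must be >= 1");
        }
        this.databaseSequence = databaseSequence;
        this.allocationSize = allocationSize;
    }

    long nextId() {
        if (remaining == 0) {
            next = databaseSequence.getAsLong(); // first id of a fresh block
            remaining = allocationSize;
        }
        remaining--;
        return next++;
    }
}
```

Under this model, generating 100 ids costs 2 sequence calls with allocationSize=50 and 100 calls with allocationSize=1. The allocationSize on @SequenceGenerator has to agree with the increment of the database sequence, otherwise generated ids can collide.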

Native Query

If the SQL generated by the JPA provider does not meet your needs, consider a native query, created through entityManager.createNativeQuery() or declared with @NamedNativeQuery.

Second Level Cache

Consider enabling the L2 cache, but read the documentation first. Hibernate, for example, advises against caching entities by default:

By default, entities are not part of the second level cache and we recommend you to stick to this setting.
