JPA 101

2017 Apr 6

If there’s one thing you have to understand to successfully use JPA (Java Persistence API) it’s the concept of a Cache. Almost everything boils down to the Cache at one point or another.

Here’s a quick cheat sheet of the JPA world:

A Cache is a copy of data, copy meaning pulled from but living outside the database.

Flushing a Cache is the act of putting modified data back into the database.

A PersistenceContext is essentially a Cache. It also tends to have its own non-shared database connection.

An EntityManager represents a PersistenceContext (and therefore a Cache).

An EntityManagerFactory creates an EntityManager (and therefore a PersistenceContext/Cache).

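In essence, an ORM framework is concerned with the interaction between the data in the database and its cache (the in-memory objects).

The cache discussed in the article is mainly the "first level cache".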
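The cheat sheet above can be illustrated without any JPA dependency. The following is only a sketch under stated assumptions: `MiniPersistenceContext`, `User`, and the `DATABASE` map are hypothetical stand-ins, not JPA types, and a real PersistenceContext does far more (dirty checking, transactions, identifiers generated by the database).

```java
import java.util.HashMap;
import java.util.Map;

public class FirstLevelCacheSketch {
    /** Hypothetical entity; a real JPA entity would be annotated with @Entity. */
    static final class User {
        final long id;
        String name;

        User(long id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** Stand-in for a database table: id -> name. */
    static final Map<Long, String> DATABASE = new HashMap<>();

    /** Plays the role of a PersistenceContext: a cache of the entities loaded so far. */
    static final class MiniPersistenceContext {
        private final Map<Long, User> cache = new HashMap<>();

        /** Returns the cached instance if present, otherwise copies the row out of the "database". */
        User find(long id) {
            return cache.computeIfAbsent(id, key -> {
                String name = DATABASE.get(key);
                return name == null ? null : new User(key, name);
            });
        }

        /** Flushing: writes the cached state back to the "database". */
        void flush() {
            for (User user : cache.values()) {
                DATABASE.put(user.id, user.name);
            }
        }
    }

    public static void main(String[] args) {
        DATABASE.put(1L, "alice");
        MiniPersistenceContext context = new MiniPersistenceContext();

        User first = context.find(1L);
        User second = context.find(1L);
        System.out.println(first == second);  // true: both lookups hit the same cached copy

        first.name = "bob";                   // only the copy in the cache changes
        System.out.println(DATABASE.get(1L)); // alice

        context.flush();                      // the modification reaches the "database"
        System.out.println(DATABASE.get(1L)); // bob
    }
}
```

The point of the sketch is the gap between the second and third print: until the flush, the cache and the database disagree, which is where most JPA surprises come from.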

jpa

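Book recommendation: Language Implementation Patterns

2017 Apr 6

Language Implementation Patterns explains deep material in simple terms and is an effortless read.

book compilers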

Learning Scala, First Taste

2017 Mar 11

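There are no primitive types; everything is an object.

1.toString() //1 is an Int object
1.to(10)

Functions are first-class citizens. A function can be passed as an argument to another function, returned from one, and so on.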

val f = (x:Int)=>x+1

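The style leans toward scripting languages and is more "DSL-friendly".

1.toString //the () can be omitted, and statements do not need a trailing ";"
1 to 10    //equivalent to 1.to(10)

As in Ruby, most expressions have a value. The value of a {} block is the value of the last expression it contains. This is also why a Scala function definition does not need a return statement.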

val s = if (x > 0) 1 else -1

Emphasis on immutability.

val a = 1 //a cannot be reassigned
var b = 2 //b can be reassigned
val c = Map("foo"->42) //c has type scala.collection.immutable.Map; Scala prefers immutable types by default

Operators can be overloaded.

ages += ("Fred" -> 20)  //insert into a Map
ages -= "Joshua"        //remove

More concise:

class Person(var age: Int) { // a field is generated automatically, along with its getter and setter
}

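highOrderFunc((x: Double) => 3 * x)
highOrderFunc((x) => 3 * x)     // the parameter type of the anonymous function can be inferred from the definition of highOrderFunc, so it can be omitted
highOrderFunc(x => 3 * x)       // the () can be omitted
highOrderFunc(3 * _)            // "_" can stand for the parameter

List(1, 2, 3).reduceLeft(_ + _) // the first "_" is the anonymous function's first parameter, the second "_" is its second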

scala

The JVM's invokedynamic Instruction

2017 Mar 8

Java Virtual Machine Support for Non-Java Languages

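invokedynamic is a new JVM instruction introduced in Java 7. With invokedynamic, the method to call can be decided at run time. It makes implementing Ruby-like dynamic languages on the JVM much more convenient.

The following two paragraphs summarize the invokedynamic mechanism very well.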

Each instance of an invokedynamic instruction is called a dynamic call site. A dynamic call site is originally in an unlinked state, which means that there is no method specified for the call site to invoke. As previously mentioned, a dynamic call site is linked to a method by means of a bootstrap method. A dynamic call site’s bootstrap method is a method specified by the compiler for the dynamically-typed language that is called once by the JVM to link the site. The object returned from the bootstrap method permanently determines the call site’s behavior.

The invokedynamic instruction contains a constant pool index (in the same format as for the other invoke instructions). This constant pool index references a CONSTANT_InvokeDynamic entry. This entry specifies the bootstrap method (a CONSTANT_MethodHandle entry), the name of the dynamically linked method, and the argument types and return type of the call to the dynamically linked method.

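The place where an invokedynamic instruction appears is called a "call site". Initially the call site is in the "unlinked" state, meaning the method to call has not yet been determined. The first time the instruction is executed, the JVM calls a "bootstrap method" to determine the method to call.

The operands of an invokedynamic instruction carry the information about that instruction's bootstrap method (a constant pool index "points to" the bootstrap method).

A bootstrap method is an ordinary Java method with fixed parameter types and a fixed return type. The return type is CallSite (a CallSite object holds a MethodHandle, and a MethodHandle in effect corresponds to a method). Below is the declaration of a bootstrap method: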

public static CallSite mybsm(
    MethodHandles.Lookup callerClass, String dynMethodName, MethodType dynMethodType) {}

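To use invokedynamic, first define the bootstrap method (several invokedynamic instructions may share one bootstrap method, depending on the situation). Then construct the related bytecode, such as the bytecode for the invokedynamic instruction itself and for the constant pool entries. The ASM library can help construct the bytecode, as in this example.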
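Java source code cannot emit an invokedynamic instruction directly, so the sketch below only simulates the linking step by hand: it calls a bootstrap method with the same shape as `mybsm`, receives a CallSite, and invokes through it. The class name `BootstrapSketch`, the target method `greet`, and the helper `linkAndCall` are assumptions made for illustration; the `java.lang.invoke` classes are the standard ones.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.ConstantCallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class BootstrapSketch {
    /** Hypothetical target method that the call site gets linked to. */
    public static String greet(String name) {
        return "hello, " + name;
    }

    /** Bootstrap method: resolves the target by name and type, then fixes it in a CallSite. */
    public static CallSite mybsm(
            MethodHandles.Lookup caller, String dynMethodName, MethodType dynMethodType)
            throws NoSuchMethodException, IllegalAccessException {
        MethodHandle target =
                caller.findStatic(BootstrapSketch.class, dynMethodName, dynMethodType);
        return new ConstantCallSite(target); // the link never changes afterwards
    }

    /** Does by hand what the JVM does at an unlinked call site: link once, then invoke. */
    public static String linkAndCall(String argument) {
        try {
            MethodType type = MethodType.methodType(String.class, String.class);
            CallSite site = mybsm(MethodHandles.lookup(), "greet", type);
            return (String) site.dynamicInvoker().invoke(argument);
        } catch (Throwable t) {
            throw new IllegalStateException("linking or invocation failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(linkAndCall("JVM")); // prints "hello, JVM"
    }
}
```

A language runtime would return a MutableCallSite or VolatileCallSite instead when the target may need to be relinked later; ConstantCallSite is used here only because it is the simplest case.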

Besides implementing dynamic languages on the JVM, invokedynamic is also used to implement lambdas in Java 8.

See the official documentation for more details.

java

Garbage Collection (GC) vs. Automatic Reference Counting (ARC)

2017 Mar 6

Garbage Collection vs. ARC

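An interview with Chris Lattner, the creator of Swift, on which is "better": GC (Garbage Collection) or ARC (Automatic Reference Counting).

Both GC and ARC free programmers (to a degree) from worrying about memory management. GC quietly reclaims objects using background threads, while ARC has the compiler insert the code that releases objects. GC implementations often rely on write barriers, and ARC has to update reference counts at run time, so both carry some performance overhead.

Neither GC nor ARC completely frees programmers from memory management. With GC you have to care about stop-the-world pauses, leaks of large objects, and so on; with ARC you have to care about reference cycles and the like.

Chris Lattner nevertheless favors ARC, because ARC gives programmers a choice: when necessary, they can take full, manual control of memory management. GC languages such as Java cannot do without the garbage collector. That is why Swift can serve as a "systems-level" programming language while Java cannot.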

programming language

An Introduction to Event Sourcing

2017 Mar 6

Event Sourcing in Practice

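A short introduction to Event Sourcing.

Put simply, Event Sourcing can be seen as a different approach to (database) modeling, together with the changes that follow from it. Compared with "traditional" modeling, Event Sourcing does not store an object's current state; it stores the sequence of events that caused the state to change. The current state is obtained by applying (replaying) all the events.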
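The replay idea can be sketched in a few lines. Everything here is hypothetical and chosen for illustration: a bank account whose only event type is `BalanceChanged`, and an in-memory `EventStore` standing in for an append-only table.

```java
import java.util.ArrayList;
import java.util.List;

public class EventSourcingSketch {
    /** Hypothetical event: a signed change to an account balance. */
    static final class BalanceChanged {
        final String reason;
        final long amount; // positive for a deposit, negative for a withdrawal

        BalanceChanged(String reason, long amount) {
            this.reason = reason;
            this.amount = amount;
        }
    }

    /** Append-only log; this is what gets persisted instead of the current balance. */
    static final class EventStore {
        private final List<BalanceChanged> events = new ArrayList<>();

        void append(BalanceChanged event) {
            events.add(event);
        }

        /** Returns a copy so callers cannot rewrite history. */
        List<BalanceChanged> all() {
            return new ArrayList<>(events);
        }
    }

    /** Rebuilds the current state by applying the events in order. */
    static long replay(List<BalanceChanged> events) {
        long balance = 0;
        for (BalanceChanged event : events) {
            balance += event.amount;
        }
        return balance;
    }

    public static void main(String[] args) {
        EventStore store = new EventStore();
        store.append(new BalanceChanged("deposit", 100));
        store.append(new BalanceChanged("withdrawal", -30));
        store.append(new BalanceChanged("deposit", 5));

        System.out.println(replay(store.all()));               // 75: the current state
        System.out.println(replay(store.all().subList(0, 2))); // 70: the state after two events
    }
}
```

The second print shows where the traceability comes from: any past state is just a replay of a prefix of the log.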

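The benefits of Event Sourcing include:

Very strong traceability: the full history of state changes is available.
Because the database model differs so much from the in-memory object model, an ORM is no longer needed, and neither are the problems an ORM brings.

For the database, events are writes, and they are not query-friendly. The CQRS pattern can be used to design a corresponding read model that serves queries.

Event Sourcing suits certain scenarios, for example: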

accountability/debugability is critical
you need version control/undo for data (e.g. Wikis, Google Docs)
your business derives value or competitive advantage from event data
your domain is inherently event driven (e.g. basketball game tracking)

Further reading:

Event Sourcing, overcoming the monolith

arch event sourcing CQRS

API Design with Java 8

2017 Mar 5

API Design with Java 8

If a return value may be null, use the Optional type. Avoid using Optional as a method parameter, as it makes the API clumsy to call.

public Optional<String> getComment() {
    return Optional.ofNullable(comment);
}

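Do not use arrays as return values or parameters. When an array is returned, the source array often has to be copied first to stop callers from modifying it, which is inefficient (for example, Enum::values()). When an array is passed as a parameter, other threads may be modifying it at the same time. If you need to emphasize that a returned collection cannot be modified, consider Stream as the return type.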

public Stream<String> comments() {
    return Stream.of(comments);
}

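Consider adding static interface methods that create instances of the implementing class. Compared with Point point = new PointImpl(1,2);, Point point = Point.of(1,2); hides the concrete implementation class.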
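A minimal sketch of such a factory method follows. The accessor names `x()` and `y()` and the wrapper class `StaticFactorySketch` are assumptions; only `Point`, `PointImpl`, and `Point.of` come from the text.

```java
public class StaticFactorySketch {
    /** Public API: callers only ever name this interface. */
    interface Point {
        int x();

        int y();

        /** Static factory on the interface; the implementation class stays an internal detail. */
        static Point of(int x, int y) {
            return new PointImpl(x, y);
        }
    }

    /** Implementation detail that can be renamed or replaced without touching callers. */
    static final class PointImpl implements Point {
        private final int x;
        private final int y;

        PointImpl(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public int x() {
            return x;
        }

        @Override
        public int y() {
            return y;
        }
    }

    public static void main(String[] args) {
        Point point = Point.of(1, 2);
        System.out.println(point.x() + "," + point.y()); // 1,2
    }
}
```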

Consider using lambdas, rather than inheritance, to customize an object's behavior.

//don't do this
Reader reader = new AbstractReader() {
    @Override
    public void handleError(IOException ioe) {
        ioe.printStackTrace();
    }
};

//better to expose a static method or a builder in the Reader interface that takes a Consumer<IOException> and applies it to an internal generic ReaderImpl
Reader reader = Reader.builder()
    .withErrorHandler(IOException::printStackTrace)
    .build();
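The builder used above is not a JDK class, so here is one way it might be put together. This is a sketch under assumptions: `Reader`, `Builder`, `ReaderImpl`, and the `read()` method are all hypothetical, and the I/O failure is simulated so the example runs on its own.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Consumer;

public class LambdaCustomizationSketch {
    /** Hypothetical API; not java.io.Reader. */
    interface Reader {
        void read();

        static Builder builder() {
            return new Builder();
        }
    }

    /** Collects the customizations as plain function objects. */
    static final class Builder {
        private Consumer<IOException> errorHandler = e -> { }; // default: ignore errors

        Builder withErrorHandler(Consumer<IOException> handler) {
            this.errorHandler = Objects.requireNonNull(handler);
            return this;
        }

        Reader build() {
            return new ReaderImpl(errorHandler);
        }
    }

    /** Single generic implementation; behavior is injected, not overridden. */
    static final class ReaderImpl implements Reader {
        private final Consumer<IOException> errorHandler;

        ReaderImpl(Consumer<IOException> errorHandler) {
            this.errorHandler = errorHandler;
        }

        @Override
        public void read() {
            try {
                throw new IOException("simulated read failure"); // stand-in for real I/O
            } catch (IOException e) {
                errorHandler.accept(e);
            }
        }
    }

    public static void main(String[] args) {
        List<String> errors = new ArrayList<>();
        Reader reader = Reader.builder()
                .withErrorHandler(e -> errors.add(e.getMessage()))
                .build();
        reader.read();
        System.out.println(errors); // [simulated read failure]
    }
}
```

With this shape the API author keeps one implementation class, and callers never need a subclass to change error handling.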

Annotate functional interfaces with @FunctionalInterface to guarantee that the interface has exactly one abstract method.
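As a small illustration, the hypothetical `Transformer` interface below carries the annotation. Default methods do not count toward the single-abstract-method limit; adding a second abstract method would turn into a compile-time error instead of silently breaking every lambda that targets the interface.

```java
public class FunctionalInterfaceSketch {
    /** Hypothetical interface with exactly one abstract method. */
    @FunctionalInterface
    interface Transformer {
        String transform(String input);

        /** Default methods are allowed; they are not abstract. */
        default Transformer andThen(Transformer next) {
            return input -> next.transform(transform(input));
        }

        // Declaring a second abstract method here would fail to compile
        // because of the @FunctionalInterface annotation.
    }

    public static void main(String[] args) {
        Transformer trim = String::trim;
        Transformer upper = String::toUpperCase;
        System.out.println(trim.andThen(upper).transform("  spock  ")); // SPOCK
    }
}
```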

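If an API requires non-null parameters, consider validating them with Objects.requireNonNull() so that the exception is thrown early. Do not worry too much about the performance cost; the JVM can optimize away unnecessary checks.

public void addToSegment(Segment segment, Point point) {
    Objects.requireNonNull(segment); //the exception is thrown earlier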
    Objects.requireNonNull(point);
    segment.add(point);
}

See the original article for more.

java 8 java

Spock, Yet Another Java Testing Framework

2017 Mar 3

JUnit vs Spock + Spock Cheatsheet

Spock is a testing framework implemented in Groovy. It can test both Groovy and Java code.

More concise than JUnit:

class MathSpec extends Specification {
    def "maximum of two numbers"(int a, int b, int c) {
        expect:
        Math.max(a, b) == c

        where:
        a | b | c
        1 | 3 | 3   //passes
        7 | 4 | 4   //fails
        0 | 0 | 0   //passes
    }
}

Native support for mocking and stubbing, with no need for a mock framework such as Mockito:

def "should send messages to all subscribers"() {
    when:
    publisher.send("hello")

    then:
    1 * subscriber.receive("hello") //subscriber should receive "hello" exactly once
    1 * subscriber2.receive("hello")
}

subscriber.receive(_) >>> ["ok", "error", "error", "ok"]

Supports BDD (Behavior-Driven Development):

given: //data initialization goes here (includes creating mocks)
when: //invoke your test subject here and assign it to a variable
then: //assert data here
cleanup: //optional
where: //optional: provide parameterized data (tables or pipes)

Better failure messages:

maximum of two numbers   FAILED

Condition not satisfied:

Math.max(a, b) == c
	|    |  |  |  |
	|    7  0  |  7
	42         false

Spock uses JUnit's runner infrastructure, so reports such as code coverage are supported as well.

Other references:

An introduction to Spock
So Why is Spock Such a Big Deal?
Comparing Spock and Junit

spock java unit test groovy

String.intern()

2017 Mar 2

String.intern in Java 6, 7 and 8 – string pooling

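With String.intern(), strings with the same value can share a single String object in the JVM, which reduces memory consumption. String literals and String-typed constants used by the program are also interned.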
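A quick way to see the sharing is to compare references. The class name `InternSketch` is just a label for the example; the behavior shown is that of the standard `String.intern()` method.

```java
public class InternSketch {
    public static void main(String[] args) {
        String literal = "foo";          // literals are interned automatically
        String copy = new String("foo"); // a distinct object with the same value

        System.out.println(copy.equals(literal));     // true: same characters
        System.out.println(copy == literal);          // false: two different objects
        System.out.println(copy.intern() == literal); // true: intern() returns the pooled object
    }
}
```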

Here "intern" means "to detain":

n. intern, medical intern
vt. to detain, to confine
vi. to work as a medical intern

Calling "foo".intern() “detains” the string “foo” in a pool kept inside the String class.

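The String pool is actually a fixed-size hash table. Since Java 7, the String pool lives in heap memory.

Because the hash table has a fixed size and does not grow automatically, choosing the size matters. If the table is too small relative to the number of interned strings, its performance can degrade to that of a linear search.

# intern another 10000 strings after 1000000 strings have already been interned
# pool size 60013: many collisions, hash table performance degrades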
time = 1.913 sec
# pool size 100003
time = 0.012 sec

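Unlike an ordinary HashMap, strings in the String pool can be garbage collected once nothing references them. In this respect the String pool behaves much like a WeakHashMap<String, WeakReference<String>>. However, a WeakHashMap<String, WeakReference<String>> consumes 5 times as much memory as the String pool implementation.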
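For comparison, a hand-rolled pool built on that map type could look like the sketch below. It is not how the JVM implements its String pool; `WeakInternerSketch` and its `intern` method are hypothetical names, and the value is wrapped in a WeakReference because a strong reference to the value would keep the equal key alive forever.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

public class WeakInternerSketch {
    // Keys are held weakly by WeakHashMap; values are wrapped so they do not pin the keys.
    private final Map<String, WeakReference<String>> pool = new WeakHashMap<>();

    /** Returns the canonical instance for the given value, registering it if needed. */
    public synchronized String intern(String value) {
        WeakReference<String> reference = pool.get(value);
        String canonical = (reference == null) ? null : reference.get();
        if (canonical == null) {
            canonical = value;
            pool.put(value, new WeakReference<>(value));
        }
        return canonical;
    }

    public static void main(String[] args) {
        WeakInternerSketch interner = new WeakInternerSketch();
        String first = interner.intern(new String("foo"));
        String second = interner.intern(new String("foo"));
        System.out.println(first == second); // true while "first" is still strongly referenced
    }
}
```

Once no strong reference to the canonical string remains, the entry becomes eligible for garbage collection, which mirrors the behavior described for the String pool.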

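Since Java 7u40 the default String pool size is 60013. The -XX:StringTableSize=N option sets an appropriate size for the pool. For good hash table performance the pool size should be a prime number, such as 60013. To avoid collisions, the size can be set to a prime slightly larger than twice the number of strings to be interned.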
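The sizing rule of thumb can be turned into a small helper. This is only a sketch: `suggestedTableSize` is a hypothetical name, and it relies on `BigInteger.nextProbablePrime()`, which returns a value that is prime with overwhelming probability rather than a proven prime.

```java
import java.math.BigInteger;

public class StringTableSizeSketch {
    /** Picks a (probable) prime just above twice the expected number of interned strings. */
    static long suggestedTableSize(long expectedInternedStrings) {
        return BigInteger.valueOf(2 * expectedInternedStrings)
                .nextProbablePrime()
                .longValueExact();
    }

    public static void main(String[] args) {
        long size = suggestedTableSize(1_000_000);
        // The printed flag can be passed to the JVM on startup.
        System.out.println("-XX:StringTableSize=" + size);
    }
}
```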

In the Java 6 era the String pool lived in PermGen, so interning too many strings caused PermGen to run out of memory.

In addition, the -XX:+PrintStringTableStatistics option prints String pool usage, and -XX:+PrintFlagsFinal shows the String pool size.

java

Session Management Cheat Sheet

2017 Mar 2

Session Management Cheat Sheet

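A very detailed cheat sheet on sessions from OWASP, focused on how to manage sessions securely. Since a session serves as an authentication credential equivalent to a username and password, the security of session management is very important.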

web http session reading list