
Java Reference Types: Soft References (1)

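Java uses SoftReference to represent soft references, which describe objects that are still useful but not essential. Before the JVM throws an OutOfMemoryError, objects that are reachable only through soft references are added to the set of objects eligible for collection, and a second collection is performed. Only if that collection still does not free enough memory is the OutOfMemoryError thrown. In short: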

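  • If there is enough memory, the garbage collector does not reclaim softly referenced objects.
  • If memory is insufficient, the garbage collector reclaims softly referenced objects before it throws an OutOfMemoryError.
    How "enough memory" is calculated is described in detail later.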
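As a small illustration of the two rules above, the following sketch wraps a value in a SoftReference and falls back to a default once the reference has been cleared. SoftRefBasics and getOrDefault are hypothetical names introduced only for this example; java.lang.ref.SoftReference and its get() and clear() methods are the only JDK APIs involved.

```java
import java.lang.ref.SoftReference;

// Hypothetical demo class; not part of the JDK or of HotSpot.
class SoftRefBasics {
    // Returns the referent if it is still present, otherwise the fallback.
    static <T> T getOrDefault(SoftReference<T> ref, T fallback) {
        T value = ref.get();   // null once the GC (or clear()) has cleared the reference
        return value != null ? value : fallback;
    }

    public static void main(String[] args) {
        byte[] payload = new byte[1024];   // the strong reference keeps the array alive
        SoftReference<byte[]> cached = new SoftReference<>(payload);
        // While 'payload' is strongly reachable, get() returns the same array.
        System.out.println(getOrDefault(cached, new byte[0]) == payload); // prints true
    }
}
```

Callers of a soft reference should always handle the null case, because the point at which the GC clears the referent is not under the application's control.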

The SoftReference class, with its important fields and methods, is defined as follows:

public class SoftReference<T> extends Reference<T> {    
    static private  long clock;  
    private long    timestamp;  
    public SoftReference(T referent) {  
        super(referent);    
        this.timestamp = clock; 
    }   
    public SoftReference(T referent, ReferenceQueue<? super T> q) {   
        super(referent, q); 
        this.timestamp = clock; 
    }   
    public T get() {    
        T o = super.get();  
        if (o != null && this.timestamp != clock)   
            this.timestamp = clock; 
        return o;   
    }   
}

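The class defines two fields, clock and timestamp. Together they measure how long a soft reference has gone unused, and that interval is compared against a threshold derived from the available heap space to decide whether the referent should be reclaimed. clock is a static field that the GC sets to the current time on each collection; timestamp is updated to the current value of clock when get() is called and the two values differ.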
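The interplay of the two fields can be modelled in a few lines of plain Java. This is only a sketch: ModelSoftRef and idleTime() are hypothetical names, and the static clock is advanced by hand here, whereas in the real class it is advanced by the GC.

```java
// Hypothetical stand-in for java.lang.ref.SoftReference; it models only the two fields.
class ModelSoftRef<T> {
    static long clock;        // in the real class this is advanced by the GC
    private long timestamp;   // value of clock at creation or at the last get()
    private final T referent;

    ModelSoftRef(T referent) {
        this.referent = referent;
        this.timestamp = clock;
    }

    T get() {
        if (referent != null && timestamp != clock) {
            timestamp = clock;   // record that the referent was used after the last clock update
        }
        return referent;
    }

    // How long, in clock units, the reference has gone unused.
    long idleTime() {
        return clock - timestamp;
    }

    public static void main(String[] args) {
        ModelSoftRef.clock = 1_000;
        ModelSoftRef<String> ref = new ModelSoftRef<>("data");
        ModelSoftRef.clock = 4_000;           // simulate a later GC advancing the clock
        System.out.println(ref.idleTime());   // prints 3000
        ref.get();                            // refreshes timestamp to the current clock
        System.out.println(ref.idleTime());   // prints 0
    }
}
```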

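During GC, HotSpot calls ReferenceProcessor::process_discovered_references() to process the discovered references (soft, weak, final, and phantom references). Its handling of soft references looks like this: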

ReferenceProcessorStats ReferenceProcessor::process_discovered_references(
      BoolObjectClosure*           is_alive,
      OopClosure*                  keep_alive,
      VoidClosure*                 complete_gc,
      AbstractRefProcTaskExecutor* task_executor,
      GCTimer*                     gc_timer
){

  // ... 
  _soft_ref_timestamp_clock = java_lang_ref_SoftReference::clock(); 

  // Soft references
  size_t soft_count = 0;
  {
    soft_count =  process_discovered_reflist(_discoveredSoftRefs,
                                              _current_soft_ref_policy, true,
                                              is_alive, keep_alive, complete_gc, task_executor);
  }

  update_soft_ref_master_clock();

  // Processing of the other reference types is omitted
}

The java_lang_ref_SoftReference::clock() method called above is implemented as follows:

jlong java_lang_ref_SoftReference::clock() {
   InstanceKlass* ik = InstanceKlass::cast(SystemDictionary::SoftReference_klass());
   jlong* offset = (jlong*)ik->static_field_addr(static_clock_offset);
   return *offset;
}

address InstanceKlass::static_field_addr(int offset) {
   return (address)(offset + InstanceMirrorKlass::offset_of_static_fields() + cast_from_oop<intptr_t>(java_mirror()));
}

This method reads the value of the static clock field of the java.lang.ref.SoftReference class.

The ReferenceProcessor::update_soft_ref_master_clock() method called above is implemented as follows:


void ReferenceProcessor::update_soft_ref_master_clock() {
  // Update (advance) the soft ref master clock field. This must be done
  // after processing the soft ref list.

  // We need a monotonically non-decreasing time in ms but
  // os::javaTimeMillis() does not guarantee monotonicity.
  jlong  now = os::javaTimeNanos() / NANOSECS_PER_MILLISEC;
  jlong  soft_ref_clock = java_lang_ref_SoftReference::clock();
  assert(soft_ref_clock == _soft_ref_timestamp_clock, "soft ref clocks out of sync");

  // The values of now and _soft_ref_timestamp_clock are set using
  // javaTimeNanos(), which is guaranteed to be monotonically
  // non-decreasing provided the underlying platform provides such
  // a time source (and it is bug free).
  // In product mode, however, protect ourselves from non-monotonicity.
  if (now > _soft_ref_timestamp_clock) {
    _soft_ref_timestamp_clock = now;
    java_lang_ref_SoftReference::set_clock(now);
  }
  // Else leave clock stalled at its old value until time progresses
  // past clock value.
}
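The guard in update_soft_ref_master_clock() amounts to "advance the clock only if time has moved forward". A minimal Java sketch of that rule follows; SoftRefClockModel and advance() are hypothetical names, and System.nanoTime() is used here as the monotonic time source on the Java side.

```java
// Hypothetical model of the clock update rule; not HotSpot code.
class SoftRefClockModel {
    // Returns the new clock value: it advances only when nowMs is ahead of the
    // current clock, otherwise the clock stays stalled at its old value.
    static long advance(long currentClockMs, long nowMs) {
        return nowMs > currentClockMs ? nowMs : currentClockMs;
    }

    public static void main(String[] args) {
        long clock = System.nanoTime() / 1_000_000;     // nanoseconds converted to ms
        long later = System.nanoTime() / 1_000_000;
        clock = advance(clock, later);                  // normal case: moves forward or stays
        long unchanged = advance(clock, clock - 500);   // a time source that went backwards
        System.out.println(unchanged == clock);         // prints true: the clock never moves back
    }
}
```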

The soft references themselves are processed by process_discovered_reflist(), which is implemented as follows:

size_t ReferenceProcessor::process_discovered_reflist(
      DiscoveredList               refs_lists[],   // the DiscoveredList array mentioned earlier
      ReferencePolicy*             policy,         // non-NULL only for soft references; NULL for the other reference types
      bool                         clear_referent, // true for soft and weak references, false for final and phantom references
      BoolObjectClosure*           is_alive,
      OopClosure*                  keep_alive,
      VoidClosure*                 complete_gc,
      AbstractRefProcTaskExecutor* task_executor
){
  bool mt_processing = task_executor != NULL && _processing_is_mt;
  // If discovery used MT and a dynamic number of GC threads, then
  // the queues must be balanced for correctness if fewer than the
  // maximum number of queues were used.  The number of queue used
  // during discovery may be different than the number to be used
  // for processing so don't depend of _num_q < _max_num_q as part
  // of the test.
  bool must_balance = _discovery_is_mt;

  if (
       (mt_processing && ParallelRefProcBalancingEnabled) ||
       must_balance
  ){
    balance_queues(refs_lists);
  }

  size_t total_list_count = total_count(refs_lists);

  // Phase 1 (soft refs only):
  // . Traverse the list and remove any SoftReferences whose
  //   referents are not alive, but that should be kept alive for
  //   policy reasons. Keep alive the transitive closure of all
  //   such referents.
  if (policy != NULL) {
    if (mt_processing) {
       RefProcPhase1Task phase1(*this, refs_lists, policy, true /*marks_oops_alive*/);
       task_executor->execute(phase1);
    } else {
       for (uint i = 0; i < _max_num_q; i++) {
         process_phase1(refs_lists[i], policy,is_alive, keep_alive, complete_gc);
       }
    }
  } else { // policy == NULL
     assert(refs_lists != _discoveredSoftRefs,"Policy must be specified for soft references.");
  }

  // Phase 2:
  // . Traverse the list and remove any refs whose referents are alive.
  if (mt_processing) {
     RefProcPhase2Task phase2(*this, refs_lists, !discovery_is_atomic() /*marks_oops_alive*/);
     task_executor->execute(phase2);
  } else {
     for (uint i = 0; i < _max_num_q; i++) {
       process_phase2(refs_lists[i], is_alive, keep_alive, complete_gc);
     }
  }

  // Phase 3:
  // . Traverse the list and process referents as appropriate.
  if (mt_processing) {
     RefProcPhase3Task phase3(*this, refs_lists, clear_referent, true /*marks_oops_alive*/);
     task_executor->execute(phase3);
  } else {
     for (uint i = 0; i < _max_num_q; i++) {
       process_phase3(refs_lists[i], clear_referent,is_alive, keep_alive, complete_gc);
     }
  }

  return total_list_count;
}

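References are processed in three phases, but phase 1 applies only to soft references, because the policy argument is non-NULL only when soft references are being processed. refs_lists holds the references of one kind (soft, weak, phantom, and so on) that were discovered during this GC. process_discovered_reflist() removes from refs_lists the references whose referents should not be reclaimed, so that only references whose referents are to be reclaimed remain. The first of the remaining elements is eventually assigned to the Reference.pending field mentioned earlier.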

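When mt_processing is true, the work inside each of the three phases is executed in parallel, while the phases themselves still run one after another; otherwise the tasks within a phase also run serially. mt_processing is false by default, so only the serial case is described below.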
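The three phases can be pictured as successive filters over one list. The Java sketch below is a simplified model under stated assumptions, not HotSpot code: ThreePhaseModel, Ref, and process() are hypothetical names, the shouldClear predicate stands in for the policy, and the phase 3 behaviour (clear the referent when clearReferent is true, otherwise keep it alive) is an assumption inferred from the clear_referent parameter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of the three processing phases.
class ThreePhaseModel {
    // Simplified stand-in for one discovered Reference object.
    static final class Ref {
        final String name;
        boolean referentAlive;   // models the result of the is_alive check
        boolean cleared;         // set when phase 3 clears the referent
        Ref(String name, boolean referentAlive) {
            this.name = name;
            this.referentAlive = referentAlive;
        }
    }

    // Returns the references that remain on the list after all three phases.
    static List<Ref> process(List<Ref> discovered, Predicate<Ref> shouldClear, boolean clearReferent) {
        List<Ref> list = new ArrayList<>(discovered);
        // Phase 1 (soft references only): drop refs whose referent is dead
        // but which the policy wants to retain, and keep those referents alive.
        if (shouldClear != null) {
            list.removeIf(r -> {
                if (!r.referentAlive && !shouldClear.test(r)) {
                    r.referentAlive = true;
                    return true;
                }
                return false;
            });
        }
        // Phase 2: drop refs whose referents are alive.
        list.removeIf(r -> r.referentAlive);
        // Phase 3: handle the referents of what is left (assumed behaviour, see lead-in).
        for (Ref r : list) {
            if (clearReferent) {
                r.cleared = true;
            } else {
                r.referentAlive = true;
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<Ref> refs = new ArrayList<>();
        refs.add(new Ref("stronglyReachable", true));
        refs.add(new Ref("recentlyUsed", false));
        refs.add(new Ref("stale", false));
        // Model policy: only the reference named "stale" should be cleared.
        List<Ref> remaining = process(refs, r -> r.name.equals("stale"), true);
        for (Ref r : remaining) {
            System.out.println(r.name);   // prints stale
        }
    }
}
```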

1. process_phase1()

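The main purpose of this phase is to remove a SoftReference from refs_list when memory is sufficient, so that its referent is kept. The process_phase1() method is implemented as follows: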


// NOTE: process_phase*() are largely similar, and at a high level
// merely iterate over the extant list applying a predicate to
// each of its elements and possibly removing that element from the
// list and applying some further closures to that element.
// We should consider the possibility of replacing these
// process_phase*() methods by abstracting them into
// a single general iterator invocation that receives appropriate
// closures that accomplish this work.

// (SoftReferences only) Traverse the list and remove any SoftReferences whose
// referents are not alive, but that should be kept alive for policy reasons.
// Keep alive the transitive closure of all such referents.
void ReferenceProcessor::process_phase1(DiscoveredList&    refs_list,
                                    ReferencePolicy*   policy,
                                    BoolObjectClosure* is_alive,
                                    OopClosure*        keep_alive,
                                    VoidClosure*       complete_gc) {
  assert(policy != NULL, "Must have a non-NULL policy");
  DiscoveredListIterator iter(refs_list, keep_alive, is_alive);
  // Decide which softly reachable refs should be kept alive.
  while (iter.has_next()) {
    iter.load_ptrs(DEBUG_ONLY(!discovery_is_atomic() /* allow_null_referent */));
    bool referent_is_dead = (iter.referent() != NULL) && !iter.is_referent_alive();
    if (
          referent_is_dead && // the referent is no longer alive
          // but, according to the policy, it should not be reclaimed yet
          !policy->should_clear_reference(iter.obj(), _soft_ref_timestamp_clock)
    ){
      // Remove Reference object from list
      iter.remove();
      // Make the Reference object active again
      iter.make_active();
      // keep the referent around
      iter.make_referent_alive();
      iter.move_to_next();
    } else {
      iter.next();
    }
  }
  // Close the reachable set
  complete_gc->do_void();
}

ReferencePolicy has four implementations: NeverClearPolicy, AlwaysClearPolicy, LRUCurrentHeapPolicy, and LRUMaxHeapPolicy. The commonly used ones are LRUCurrentHeapPolicy and LRUMaxHeapPolicy, whose should_clear_reference() methods are implemented identically:

bool LRUMaxHeapPolicy::should_clear_reference(oop p,jlong timestamp_clock) {    
  jlong interval = timestamp_clock - java_lang_ref_SoftReference::timestamp(p); 
  assert(interval >= 0, "Sanity check"); 
  // The interval will be zero if the ref was accessed since the last scavenge/gc.  
  if(interval <= _max_interval) { 
    return false;   
  } 
  return true;  
}

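timestamp_clock is the value of SoftReference's static clock field, and java_lang_ref_SoftReference::timestamp(p) reads the timestamp field. If get() has been called on the SoftReference since the last GC, interval is 0; otherwise it is the time that has elapsed over the GCs since the last access.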
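The decision made by should_clear_reference() reduces to one comparison. A minimal Java rendering, with the hypothetical names LruPolicyModel and shouldClear and all values in milliseconds, might look like this:

```java
// Hypothetical Java model of the LRU policy check; not HotSpot code.
class LruPolicyModel {
    // True when the reference has been idle for longer than the allowed interval.
    static boolean shouldClear(long clockMs, long timestampMs, long maxIntervalMs) {
        long interval = clockMs - timestampMs;   // 0 if get() was called since the last GC
        return interval > maxIntervalMs;
    }

    public static void main(String[] args) {
        // Accessed since the last GC: interval is 0, so the referent is kept.
        System.out.println(shouldClear(20_000, 20_000, 10_000));   // prints false
        // Idle for 15 s with a 10 s limit: the referent may be cleared.
        System.out.println(shouldClear(35_000, 20_000, 10_000));   // prints true
    }
}
```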

_max_interval is the threshold. It is computed differently by the LRUCurrentHeapPolicy and LRUMaxHeapPolicy policies:

void LRUCurrentHeapPolicy::setup() {    
  _max_interval = (Universe::get_heap_free_at_last_gc() / M) * SoftRefLRUPolicyMSPerMB; 
  assert(_max_interval >= 0,"Sanity check"); 
}   

void LRUMaxHeapPolicy::setup() {    
  size_t  max_heap = MaxHeapSize;   
  max_heap -= Universe::get_heap_used_at_last_gc(); 
  max_heap /= M;    
  _max_interval = max_heap * SoftRefLRUPolicyMSPerMB;   
  assert(_max_interval >= 0,"Sanity check"); 
}

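In the first method, SoftRefLRUPolicyMSPerMB defaults to 1000, that is, 1000 ms/MB = 1 s/MB. So if 10 MB of heap was free after the last GC, _max_interval is 10 s, and by the logic of should_clear_reference() a soft reference is not cleared as long as no more than 10 s have passed since its referent was last accessed.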

In the second method, the computation shows that how long a referent is retained depends on (maximum heap size - heap used at the last GC).
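The arithmetic of the two setup() methods can be checked with a short Java sketch. MaxIntervalModel and its method names are hypothetical, and the heap sizes in main are made-up example values. In a real JVM the per-megabyte factor is set with the -XX:SoftRefLRUPolicyMSPerMB option.

```java
// Hypothetical model of the two _max_interval computations; not HotSpot code.
class MaxIntervalModel {
    static final long M = 1024 * 1024;   // bytes per megabyte

    // LRUCurrentHeapPolicy: based on the heap that was free at the last GC.
    static long currentHeapIntervalMs(long heapFreeAtLastGcBytes, long msPerMb) {
        return (heapFreeAtLastGcBytes / M) * msPerMb;
    }

    // LRUMaxHeapPolicy: based on the maximum heap size minus the heap used at the last GC.
    static long maxHeapIntervalMs(long maxHeapBytes, long heapUsedAtLastGcBytes, long msPerMb) {
        return ((maxHeapBytes - heapUsedAtLastGcBytes) / M) * msPerMb;
    }

    public static void main(String[] args) {
        // 10 MB free at 1000 ms/MB gives a 10 s interval.
        System.out.println(currentHeapIntervalMs(10 * M, 1000));        // prints 10000
        // 512 MB maximum heap with 112 MB used gives a 400 s interval.
        System.out.println(maxHeapIntervalMs(512 * M, 112 * M, 1000));  // prints 400000
    }
}
```

Because the maximum heap is usually larger than the currently committed heap, the second formula tends to give the longer interval for the same amount of live data.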

ReferenceProcessor::process_phase1() uses the DiscoveredListIterator to traverse the DiscoveredList. The iterator is implemented as follows:

// Iterator for the list of discovered references.
class DiscoveredListIterator {
private:
  DiscoveredList&    _refs_list;
  HeapWord*          _prev_next;
  oop                _prev;
  oop                _ref;
  HeapWord*          _discovered_addr;
  oop                _next;
  HeapWord*          _referent_addr;
  oop                _referent;
  OopClosure*        _keep_alive;
  BoolObjectClosure* _is_alive;

public:
  inline DiscoveredListIterator(DiscoveredList&    refs_list,
                                OopClosure*        keep_alive,
                                BoolObjectClosure* is_alive):
    _refs_list(refs_list),
    _prev_next(refs_list.adr_head()), // address of the previous element's next link
    _prev(NULL),
    _ref(refs_list.head()),
    _next(NULL),
    _keep_alive(keep_alive),
    _is_alive(is_alive)
{ }


  // Returns true if referent is alive.
  inline bool is_referent_alive() const {
    return _is_alive->do_object_b(_referent);
  }

  // Loads data for the current reference.
  // The "allow_null_referent" argument tells us to allow for the possibility
  // of a NULL referent in the discovered Reference object. This typically
  // happens in the case of concurrent collectors that may have done the
  // discovery concurrently, or interleaved, with mutator execution.
  void load_ptrs(DEBUG_ONLY(bool allow_null_referent));

  // Move to the next discovered reference.
  inline void next() {
    _prev_next = _discovered_addr;
    _prev = _ref;
    move_to_next();
  }


  // Make the referent alive.
  inline void make_referent_alive() {
    if (UseCompressedOops) {
      _keep_alive->do_oop((narrowOop*)_referent_addr);
    } else {
      _keep_alive->do_oop((oop*)_referent_addr);
    }
  }

  inline void move_to_next() {
    if (_ref == _next) {
      // End of the list.
      _ref = NULL;
    } else {
      _ref = _next;
    }
    assert(_ref != _first_seen, "cyclic ref_list found");
    NOT_PRODUCT(_processed++);
  }
};

The remaining methods are implemented as follows:

void DiscoveredListIterator::load_ptrs(DEBUG_ONLY(bool allow_null_referent)) {
  _discovered_addr = java_lang_ref_Reference::discovered_addr(_ref);
  oop discovered = java_lang_ref_Reference::discovered(_ref);
  assert(_discovered_addr && discovered->is_oop_or_null(),"discovered field is bad");
  _next = discovered;
  _referent_addr = java_lang_ref_Reference::referent_addr(_ref);
  _referent = java_lang_ref_Reference::referent(_ref);
  assert(Universe::heap()->is_in_reserved_or_null(_referent),"Wrong oop found in java.lang.Reference object");
  assert(allow_null_referent ?
             _referent->is_oop_or_null()
           : _referent->is_oop(),
         "bad referent");
}


void DiscoveredListIterator::remove() {
  assert(_ref->is_oop(), "Dropping a bad reference");
  oop_store_raw(_discovered_addr, NULL);

  // First _prev_next ref actually points into DiscoveredList (gross).
  oop new_next;
  if (_next == _ref) {
    // At the end of the list, we should make _prev point to itself.
    // If _ref is the first ref, then _prev_next will be in the DiscoveredList,
    // and _prev will be NULL.
    new_next = _prev;
  } else {
    new_next = _next;
  }

  if (UseCompressedOops) {
    // Remove Reference object from list.
    oopDesc::encode_store_heap_oop((narrowOop*)_prev_next, new_next);
  } else {
    // Remove Reference object from list.
    oopDesc::store_heap_oop((oop*)_prev_next, new_next);
  }
  NOT_PRODUCT(_removed++);
  _refs_list.dec_length(1);
}

// Make the Reference object active again.
void DiscoveredListIterator::make_active() {
  // For G1 we don't want to use set_next - it
  // will dirty the card for the next field of
  // the reference object and will fail
  // CT verification.
  if (UseG1GC) {
    BarrierSet* bs = oopDesc::bs();
    HeapWord* next_addr = java_lang_ref_Reference::next_addr(_ref);

    if (UseCompressedOops) {
      bs->write_ref_field_pre((narrowOop*)next_addr, NULL);
    } else {
      bs->write_ref_field_pre((oop*)next_addr, NULL);
    }
    java_lang_ref_Reference::set_next_raw(_ref, NULL);
  } else {
    java_lang_ref_Reference::set_next(_ref, NULL);
  }
}
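The list handling in move_to_next() and remove() relies on one convention visible in the code above: the last element's discovered link points to itself instead of being NULL. The following Java sketch models that convention and the unlinking logic. DiscoveredListModel, Node, of(), removeIf(), and names() are hypothetical names, and the sketch ignores compressed oops and GC barriers entirely.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical model of a discovered list that ends in a self-loop.
class DiscoveredListModel {
    static final class Node {
        final String name;
        Node discovered;   // next link; the last node points to itself
        Node(String name) { this.name = name; }
    }

    Node head;
    int length;

    // Builds a list and terminates it with a self-loop on the last node.
    static DiscoveredListModel of(String... names) {
        DiscoveredListModel list = new DiscoveredListModel();
        Node prev = null;
        for (String n : names) {
            Node node = new Node(n);
            if (prev == null) list.head = node; else prev.discovered = node;
            prev = node;
            list.length++;
        }
        if (prev != null) prev.discovered = prev;
        return list;
    }

    // Unlinks every node whose name matches, mirroring remove() and next().
    void removeIf(Predicate<String> match) {
        Node prev = null;
        Node ref = head;
        while (ref != null) {
            Node next = ref.discovered;
            boolean last = (next == ref);
            if (match.test(ref.name)) {
                ref.discovered = null;               // detach the removed node
                Node newNext = last ? prev : next;   // at the end, prev must point to itself
                if (prev == null) head = newNext; else prev.discovered = newNext;
                length--;
            } else {
                prev = ref;                          // only kept nodes become prev
            }
            ref = last ? null : next;                // a self-loop means end of list
        }
    }

    // Collects the names of the nodes currently on the list, in order.
    List<String> names() {
        List<String> out = new ArrayList<>();
        Node n = head;
        while (n != null) {
            out.add(n.name);
            n = (n.discovered == n) ? null : n.discovered;
        }
        return out;
    }

    public static void main(String[] args) {
        DiscoveredListModel list = DiscoveredListModel.of("a", "b", "c");
        list.removeIf(n -> n.equals("c"));
        System.out.println(list.names());   // prints [a, b]
    }
}
```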