https://15445.courses.cs.cmu.edu/fall2022/project1/

DB的Buffer Replacement Policies中的LRU和Clock算法都容易受到sequential flooding的影响，对于LRU来讲，使用LRU-K算法能够缓解这个问题。

LRU-K

This component is responsible for tracking page usage in the buffer pool.
The LRU-K algorithm evicts a frame whose backward k-distance is maximum of all frames in the replacer.
- Backward k-distance is computed as the difference in time between current timestamp and the timestamp of kth previous access.
- A frame with less than k historical accesses is given +inf as its backward k-distance.
- When multipe frames have +inf backward k-distance, the replacer evicts the frame with the earliest timestamp.

Backward k-distance

这个数据很关键，也很容易被理解错，下面举例说明，当k=3时候

由上面的定义可知，Backward k-distance为current_timestamp和倒数第k次访问的时间之差，故k=3时候如上面所示

因为每次current_timestamp都是一样的，所以其实直接看第k次访问的时间点就好，这里可以不用记下每次访问的时间戳，直接记录访问对应的index就好了，也就说记住某个数在第几次出现就好了。

Evit

每次都选择最大的Backward k-distance驱除(evit)，这里存在两种情况:
1. 某个数已经有了k次访问记录，那么Backward k-distance就是上图所示
2. 如果不够k次访问，那么Backward k-distance就是正无穷大
那么每次evit时候肯定都是优先选择第二种情况，因为是正无穷大
- 如果有多个无穷大，则按照FIFO的顺序evit
  比如访问顺序如下（从左到右依次访问），假设k=3
  1 2 3 4 1 2 3 1 2
  这里无限大的数有3和4，按照FIFO顺序，3第一次出现的时间点和4还早，所以第一次evit时候选择3
第二种情况没有了的话，则按照第一种情况，这里直接看哪个数倒数第k次最早出现就好了，那么就是Backward k-distance最大的

SetEvictable

需要注意的是某个数是evitable才能被evit，也就是`这个函数的作用，因为此时某个数据正在被访问，不能evit

有一种情况是SetEvictable时候，但是这个数还没有被访问过或者是被evit了，那么这个函数直接return就好了

Implementation

我是使用了两个顺序容器和一个map

map记录了某个数第n次访问的index，比如hist(1, 3)，代表1这个数第3次访问的index
当某个数还没有访问k次时候，就放到第一个容器，保证容器内的顺序是FIFO的，这样每次从队首或者队尾出队就好了，因为某些数可能不是Evictable的，实际上需要遍历一下容器，找到第一个能够Evictable的。
如果达到k次，那么把这个数从第一个容器挪到第二个容器
evit时候，如果第一个容器已经是empty的，那么代表没有正无穷的数，需要从第二个容器拿，此时因为map记录了每次容器出现的时间点，按照map对容器排序后直接取好了。这里没有想到一个方法类似第一个容器那么维护状态，考虑一下访问顺序：
1
1 1 1 2 2 2 1 X
虽然1是最后evit的，但是按照Backward 3-distance来看，1才是要evit，不能每次访问就更新那个数在第二个容器的位置。

看了下discord的讨论，有比这个更快的方法，应该使用了堆之类的容器维护

维护某个数是否已经出现，以及其状态：

struct LruEntry {
    bool evictable_{false};
    size_t access_count_{0};
  };
std::unordered_map<frame_id_t, LruEntry> lru_entry_hash_;

第一个容器：

1	std::list<frame_id_t> less_than_k_frames_;

第二个容器：

1	std::vector<frame_id_t> more_than_k_frames_;

记录每个数每次出现的时间点：

struct PairHash {
    auto operator()(const std::pair<frame_id_t, size_t> &p) const -> std::size_t {
      // 使用 std::hash 计算哈希值
      std::size_t h1 = std::hash<frame_id_t>{}(p.first);
      std::size_t h2 = std::hash<size_t>{}(p.second);
      return h1 ^ (h2 << 1);  // 可以使用异或和位移等运算来组合哈希值
    }
  };
std::unordered_map<std::pair<frame_id_t, size_t>, size_t, PairHash> hist_;

获取某个数第k次出现的时间：

auto GetLastKAccessTime(const frame_id_t x) const -> size_t {
	const auto last_k = lru_entry_hash_.at(x).access_count_ - k_ + 1;
	return hist_.at(std::make_pair(x, last_k));
}

按照Backward k-distance对第二个容器排序：

1 2	std::sort(more_than_k_frames_.begin(), more_than_k_frames_.end(), [&](frame_id_t a, frame_id_t b) { return GetLastKAccessTime(a) < GetLastKAccessTime(b); });