您现在的位置是：首页 > 后端

当前栏目

终于，我读懂了所有Java集合——set篇

JAVA set 集合所有终于读懂

2023-09-27 14:25:55 时间

HashSet

（底层是HashMap）

Set不允许元素重复。

基于HashMap实现，无容量限制。

是非线程安全的。

成员变量

private transient HashMap<E,Object> map;

// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();

构造方法

/**
 * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
 * default initial capacity (16) and load factor (0.75).
 */
public HashSet() {
    map = new HashMap<>();
}
public HashSet(int initialCapacity) {
    map = new HashMap<>(initialCapacity);
}
public HashSet(int initialCapacity, float loadFactor) {
    map = new HashMap<>(initialCapacity, loadFactor);
}

添加

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

删除

public boolean remove(Object o) {
    return map.remove(o)==PRESENT;
}

遍历

public Iterator<E> iterator() {
    return map.keySet().iterator();
}

包含

public boolean contains(Object o) {
    return map.containsKey(o);
}

TreeSet

（底层是TreeMap）

基于TreeMap实现，支持排序（自然排序 或者 根据创建TreeSet 时提供的 Comparator 进行排序）。

是非线程安全的。

成员变量

/**
* The backing map.
*/
private transient NavigableMap<E,Object> m;

// Dummy value to associate with an Object in the backing Map
private static final Object PRESENT = new Object();

构造方法

public TreeSet() {
    this(new TreeMap<E,Object>());
}



public TreeSet(Comparator<? super E> comparator) {
    this(new TreeMap<>(comparator));
}

添加

public boolean add(E e) {
    return m.put(e, PRESENT)==null;
}

删除

public boolean remove(Object o) {
    return m.remove(o)==PRESENT;
}

遍历

public Iterator<E> iterator() {
    return m.navigableKeySet().iterator();
}

包含

public boolean contains(Object o) {
    return m.containsKey(o);
}

获取开头

public E first() {
    return m.firstKey();
}

获取结尾

public E last() {
    return m.lastKey();
}

子集

public NavigableSet<E> subSet(E fromElement, boolean fromInclusive,
                              E toElement,   boolean toInclusive) {
    return new TreeSet<>(m.subMap(fromElement, fromInclusive,
                                   toElement,   toInclusive));
}

默认是含头不含尾

public SortedSet<E> subSet(E fromElement, E toElement) {
    return subSet(fromElement, true, toElement, false);
}

LinkedHashSet

（继承自HashSet，底层是LinkedHashMap）

LinkedHashSet继承自HashSet，源码更少、更简单，唯一的区别是LinkedHashSet内部使用的是LinkHashMap。这样做的意义或者好处就是LinkedHashSet中的元素顺序是可以保证的，也就是说遍历序和插入序是一致的。

类声明

public class LinkedHashSet<E>
extends HashSet<E>
implements Set<E>, Cloneable, java.io.Serializable {}

构造方法

public LinkedHashSet(int initialCapacity, float loadFactor) {
    super(initialCapacity, loadFactor, true);
}

/**
 * Constructs a new, empty linked hash set with the specified initial
 * capacity and the default load factor (0.75).
 *
 * @param   initialCapacity   the initial capacity of the LinkedHashSet
 * @throws  IllegalArgumentException if the initial capacity is less
 *              than zero
 */
public LinkedHashSet(int initialCapacity) {
    super(initialCapacity, .75f, true);
}

/**
 * Constructs a new, empty linked hash set with the default initial
 * capacity (16) and load factor (0.75).
 */
public LinkedHashSet() {
    super(16, .75f, true);
}

super指的是HashSet的default访问级别的构造方法

/**
 * Constructs a new, empty linked hash set.  (This package private
 * constructor is only used by LinkedHashSet.) The backing
 * HashMap instance is a LinkedHashMap with the specified initial
 * capacity and the specified load factor.
 *
 * @param      initialCapacity   the initial capacity of the hash map
 * @param      loadFactor        the load factor of the hash map
 * @param      dummy             ignored (distinguishes this
 *             constructor from other int, float constructor.)
 * @throws     IllegalArgumentException if the initial capacity is less
 *             than zero, or if the load factor is nonpositive
 */
HashSet(int initialCapacity, float loadFactor, boolean dummy) {
    map = new LinkedHashMap<>(initialCapacity, loadFactor);
}

BitSet

（位集，底层是long数组，用于替代List<Boolean>）

BitSet是位操作的对象，值只有0或1即false和true，内部维护了一个long数组，初始只有一个long，所以BitSet最小的size是64（8个字节64个位，可以存储64个数字），当随着存储的元素越来越多，BitSet内部会动态扩充，最终内部是由N个long来存储，这些针对操作都是透明的。

默认情况下，BitSet的所有位都是false即0。

不是线程安全的。

用1位来表示一个数据是否出现过，0为没有出现过，1表示出现过。使用的时候既可根据某一个是否为0表示，此数是否出现过。

一个1GB的空间，有8*1024*1024*1024 = 8.58*10^9bit，也就是1GB的空间可以表示85亿多个数。

常见的应用是那些需要对海量数据进行一些统计工作的时候，比如日志分析、用户数统计等等，如统计40亿个数据中没有出现的数据，将40亿个不同数据进行排序，海量数据去重等等。

JDK选择long数组作为BitSet的内部存储结构是出于性能的考虑，因为BitSet提供and和or这种操作，需要对两个BitSet中的所有bit位做and或者or，实现的时候需要遍历所有的数组元素。使用long能够使得循环的次数降到最低，所以Java选择使用long数组作为BitSet的内部存储结构。

去重示例

public static void containChars(String str) {
    BitSet used = new BitSet();
    for (int i = 0; i < str.length(); i++)
        used.set(str.charAt(i)); // set bit for char 
    StringBuilder sb = new StringBuilder();
    sb.append("[");
    int size = used.size();
    for (int i = 0; i < size; i++) {
        if (used.get(i)) {
            sb.append((char) i);
        }
    }
    sb.append("]");
    System.out.println(sb.toString());
}

public static void main(String[] args) {
    containChars("abcdfab");
}

[abcdf]

排序示例

public static void sortArray(int[] array) {

    BitSet bitSet = new BitSet(2 << 13);
    // 虽然可以自动扩容，但尽量在构造时指定估算大小,默认为64 
    System.out.println("BitSet size: " + bitSet.size());

    for (int i = 0; i < array.length; i++) {
        bitSet.set(array[i]);
    }
    //剔除重复数字后的元素个数 
    int bitLen = bitSet.cardinality();

    //进行排序，即把bit为true的元素复制到另一个数组 
    int[] orderedArray = new int[bitLen];
    int k = 0;
    for (int i = bitSet.nextSetBit(0); i >= 0; i = bitSet.nextSetBit(i + 1)) {
        orderedArray[k++] = i;
    }

    System.out.println("After ordering: ");
    for (int i = 0; i < bitLen; i++) {
        System.out.print(orderedArray[i] + "\t");
    }
}

public static void main(String[] args) {
    int[] array = new int[]{423, 700, 9999, 2323, 356, 6400, 1, 2, 3, 2, 2, 2, 2};
    sortArray(array);
}

BitSet size: 16384

After ordering:

1 2 3 356 423 700 2323 6400 9999

CopyOnWriteArraySet

（底层是CopyOnWriteArrayList）

基于CopyOnWriteArrayList实现，其唯一的不同是在add时调用的是CopyOnWriteArrayList的addIfAbsent方法。

在每次add的时候都要进行数组的遍历，因此其性能会略低于CopyOnWriteArrayList。

猜你喜欢

es6入门3--箭头函数与形参等属性的拓展
Numpy数组的形状
今天刚到货的小米平板2，就出现dnx fastboot mode一直卡死黑屏
MacOS环境下真机部署Flutter
浏览器兼容性问题及常见的解决方法
2023年全球10大必看本科必修课课程，人工智能是世界的未来，所以研究人工智能的概念是必修课（人工智能、机器学习和数据科学方向）
测开外传之数据结构与算法（Java语言描述）
SSH 出现 The authenticity of host xxx can't be established.
如何解决tensorflow报:Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
诊断Java中的内存泄露
linux下tomcat6无法显示图片验证码少了图形插件
2017iOS开发最新的打包测试步骤（亲测）
《高性能Linux服务器构建实战：系统安全、故障排查、自动化运维与集群架构》——1.7　一次Linux被入侵后的分析
H3C IP网络的结构
Scrapy 2.6 Requests and Responses 请求和响应使用指南
逃离西北？兴业太阳能亏损4470万港元出售光伏项目
腾讯Android自动化测试实战3.4　本章小结

相关主题

【Java集合】Set
（Java）堆排序
java中的Set集合
Java的锁机制
Java - 初识Java
Java对象头详解
Java搜索类
set -x 与 set +x

zl程序教程

当前栏目

终于，我读懂了所有Java集合——set篇

HashSet

TreeSet

LinkedHashSet

BitSet

CopyOnWriteArraySet

相关文章