sets
— 唯一元素的无序集合
¶
2.3 版新增。
Deprecated since version 2.6:
内置
set
/
frozenset
types replace this module.
The
sets
module provides classes for constructing and manipulating unordered collections of unique elements. Common uses include membership testing, removing duplicates from a sequence, and computing standard math operations on sets such as intersection, union, difference, and symmetric difference.
Like other collections, sets support
x in set
,
len(set)
,和
for x in
set
. Being an unordered collection, sets do not record element position or order of insertion. Accordingly, sets do not support indexing, slicing, or other sequence-like behavior.
Most set applications use the
Set
class which provides every set method except for
__hash__()
. For advanced applications requiring a hash method, the
ImmutableSet
class adds a
__hash__()
method but omits methods which alter the contents of the set. Both
Set
and
ImmutableSet
derive from
BaseSet
, an abstract class useful for determining whether something is a set:
isinstance(obj, BaseSet)
.
The set classes are implemented using dictionaries. Accordingly, the requirements for set elements are the same as those for dictionary keys; namely, that the element defines both
__eq__()
and
__hash__()
. As a result, sets cannot contain mutable elements such as lists or dictionaries. However, they can contain immutable collections such as tuples or instances of
ImmutableSet
. For convenience in implementing sets of sets, inner sets are automatically converted to immutable form, for example,
Set([Set(['dog'])])
is transformed to
Set([ImmutableSet(['dog'])])
.
sets.
Set
(
[
iterable
]
)
¶
Constructs a new empty
Set
object. If the optional
iterable
parameter is supplied, updates the set with elements obtained from iteration. All of the elements in
iterable
should be immutable or be transformable to an immutable using the protocol described in section
Protocol for automatic conversion to immutable
.
sets.
ImmutableSet
(
[
iterable
]
)
¶
Constructs a new empty
ImmutableSet
object. If the optional
iterable
parameter is supplied, updates the set with elements obtained from iteration. All of the elements in
iterable
should be immutable or be transformable to an immutable using the protocol described in section
Protocol for automatic conversion to immutable
.
因为
ImmutableSet
objects provide a
__hash__()
method, they can be used as set elements or as dictionary keys.
ImmutableSet
objects do not have methods for adding or removing elements, so all of the elements must be known when the constructor is called.
实例化的
Set
and
ImmutableSet
both provide the following operations:
|
操作 |
Equivalent |
结果 |
|---|---|---|
|
|
number of elements in set s (cardinality) |
|
|
|
test x 为成员资格在 s |
|
|
|
test x 为非成员资格在 s |
|
|
|
|
test whether every element in s 是在 t |
|
|
|
test whether every element in t 是在 s |
|
|
|
new set with elements from both s and t |
|
|
|
new set with elements common to s and t |
|
|
|
new set with elements in s but not in t |
|
|
|
new set with elements in either s or t but not both |
|
|
new set with a shallow copy of s |
注意,非运算符版本的
union()
,
intersection()
,
difference()
,和
symmetric_difference()
will accept any iterable as an argument. In contrast, their operator based counterparts require their arguments to be sets. This precludes error-prone constructions like
Set('abc') & 'cbs'
in favor of the more readable
Set('abc').intersection('cbs')
.
Changed in version 2.3.1: Formerly all arguments were required to be sets.
In addition, both
Set
and
ImmutableSet
support set to set comparisons. Two sets are equal if and only if every element of each set is contained in the other (each is a subset of the other). A set is less than another set if and only if the first set is a proper subset of the second set (is a subset, but is not equal). A set is greater than another set if and only if the first set is a proper superset of the second set (is a superset, but is not equal).
The subset and equality comparisons do not generalize to a complete ordering function. For example, any two disjoint sets are not equal and are not subsets of each other, so
all
of the following return
False
:
a<b
,
a==b
,或
a>b
. Accordingly, sets do not implement the
__cmp__()
方法。
Since sets only define partial ordering (subset relationships), the output of the
list.sort()
method is undefined for lists of sets.
The following table lists operations available in
ImmutableSet
but not found in
Set
:
|
操作 |
结果 |
|---|---|
|
|
returns a hash value for s |
The following table lists operations available in
Set
but not found in
ImmutableSet
:
|
操作 |
Equivalent |
结果 |
|---|---|---|
|
|
s |= t |
return set s with elements added from t |
|
|
s &= t |
return set s keeping only elements also found in t |
|
|
s -= t |
return set s after removing elements found in t |
|
|
s ^= t |
return set s with elements from s or t but not both |
|
|
add element x to set s |
|
|
|
remove
x
from set
s
; raises
|
|
|
|
removes x from set s if present |
|
|
|
remove and return an arbitrary element from
s
; raises
|
|
|
|
remove all elements from set s |
注意,非运算符版本的
update()
,
intersection_update()
,
difference_update()
,和
symmetric_difference_update()
will accept any iterable as an argument.
Changed in version 2.3.1: Formerly all arguments were required to be sets.
Also note, the module also includes a
union_update()
method which is an alias for
update()
. The method is included for backwards compatibility. Programmers should prefer the
update()
method because it is supported by the built-in
set()
and
frozenset()
类型。
>>> from sets import Set
>>> engineers = Set(['John', 'Jane', 'Jack', 'Janice'])
>>> programmers = Set(['Jack', 'Sam', 'Susan', 'Janice'])
>>> managers = Set(['Jane', 'Jack', 'Susan', 'Zack'])
>>> employees = engineers | programmers | managers # union
>>> engineering_management = engineers & managers # intersection
>>> fulltime_management = managers - engineers - programmers # difference
>>> engineers.add('Marvin') # add element
>>> print engineers
Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack'])
>>> employees.issuperset(engineers) # superset test
False
>>> employees.update(engineers) # update from another set
>>> employees.issuperset(engineers)
True
>>> for group in [engineers, programmers, managers, employees]:
... group.discard('Susan') # unconditionally remove element
... print group
...
Set(['Jane', 'Marvin', 'Janice', 'John', 'Jack'])
Set(['Janice', 'Jack', 'Sam'])
Set(['Jane', 'Zack', 'Jack'])
Set(['Jack', 'Sam', 'Jane', 'Marvin', 'Janice', 'John', 'Zack'])
Sets can only contain immutable elements. For convenience, mutable
Set
objects are automatically copied to an
ImmutableSet
before being added as a set element.
The mechanism is to always add a
hashable
element, or if it is not hashable, the element is checked to see if it has an
__as_immutable__()
method which returns an immutable equivalent.
由于
Set
objects have a
__as_immutable__()
method returning an instance of
ImmutableSet
, it is possible to construct sets of sets.
A similar mechanism is needed by the
__contains__()
and
remove()
methods which need to hash an element to check for membership in a set. Those methods check an element for hashability and, if not, check for a
__as_temporarily_immutable__()
method which returns the element wrapped by a class that provides temporary methods for
__hash__()
,
__eq__()
,和
__ne__()
.
The alternate mechanism spares the need to build a separate copy of the original mutable object.
Set
objects implement the
__as_temporarily_immutable__()
method which returns the
Set
object wrapped by a new class
_TemporarilyImmutableSet
.
The two mechanisms for adding hashability are normally invisible to the user; however, a conflict can arise in a multi-threaded environment where one thread is updating a set while another has temporarily wrapped it in
_TemporarilyImmutableSet
. In other words, sets of mutable sets are not thread-safe.
set
类型
¶
内置
set
and
frozenset
types were designed based on lessons learned from the
sets
module. The key differences are:
Set
and
ImmutableSet
were renamed to
set
and
frozenset
.
There is no equivalent to
BaseSet
. Instead, use
isinstance(x,
(set,
frozenset))
.
The hash algorithm for the built-ins performs significantly better (fewer collisions) for most datasets.
The built-in versions have more space efficient pickles.
The built-in versions do not have a
union_update()
method. Instead, use the
update()
method which is equivalent.
The built-in versions do not have a
_repr(sorted=True)
method. Instead, use the built-in
repr()
and
sorted()
functions:
repr(sorted(s))
.
The built-in version does not have a protocol for automatic conversion to immutable. Many found this feature to be confusing and no one in the community reported having found real uses for it.