1. Direct children: the .contents and .children attributes
1.1 .contents
A tag's .contents attribute returns the tag's direct children as a list.
print(soup.head.contents)
#[<title>The Dormouse's story</title>]
Because the result is a list, we can use a list index to get a single element:
print(soup.head.contents[0])
#<title>The Dormouse's story</title>
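As a minimal, self-contained sketch of the same idea: the markup below is a small hypothetical sample (not the document used elsewhere in this post), and it is parsed with Python's built-in "html.parser" so no extra parser needs to be installed.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
html = (
    "<html><head><title>The Dormouse's story</title></head>"
    "<body><p>text</p></body></html>"
)
soup = BeautifulSoup(html, "html.parser")

head_children = soup.head.contents   # a plain Python list of direct children
first_child = head_children[0]       # the <title> tag

print(type(head_children).__name__)  # list
print(len(head_children))            # 1
print(first_child.name)              # title
```

Because .contents is an ordinary list, slicing, len() and negative indexes all work on it as well.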
1.2 .children
.children does not return a list, but we can loop over it to get every direct child.
If we print .children, we can see that it is an iterator object rather than a list:
print(soup.head.children)
# an iterator object (the exact repr depends on the Python and bs4 versions)
for child in soup.body.children:
    print(child)
Output:
The Dormouse's story
Once upon a time there were three little sisters; and their names were
,
Lacie and
Tillie;
and they lived at the bottom of a well.
...
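The following sketch shows one way to work with .children in practice. It assumes a small hypothetical snippet of markup and the built-in "html.parser". Note that whitespace between tags is itself a child (a NavigableString), so filtering by type is often useful.

```python
from bs4 import BeautifulSoup, Tag

# Hypothetical sample markup, used only for illustration.
html = "<body><p>one</p>\n<p>two</p></body>"
soup = BeautifulSoup(html, "html.parser")

children = soup.body.children
is_list = isinstance(children, list)   # False: it is an iterator

# Materialize the iterator when a list is needed.
all_children = list(soup.body.children)

# Keep only real tags, dropping the "\n" text node between them.
tag_children = [c for c in all_children if isinstance(c, Tag)]

print(is_list)
print(len(all_children))                      # 3: <p>, "\n", <p>
print([t.get_text() for t in tag_children])   # ['one', 'two']
```

An iterator can only be consumed once, so call .children again (or keep the list) if the children are needed a second time.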
2. All descendants: the .descendants attribute
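The .contents and .children attributes only cover a tag's direct children. The .descendants attribute walks recursively through all of a tag's descendants: its children, their children, and so on. As with .children, we have to loop over it to get the contents.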
for child in soup.descendants:
    print(child)
Output:
The Dormouse's story
The Dormouse's story
Once upon a time there were three little sisters; and their names were
,
Lacie and
Tillie;
and they lived at the bottom of a well.
...
The Dormouse's story
The Dormouse's story
The Dormouse's story
The Dormouse's story
Once upon a time there were three little sisters; and their names were
,
Lacie and
Tillie;
and they lived at the bottom of a well.
...
The Dormouse's story
The Dormouse's story
The Dormouse's story
Once upon a time there were three little sisters; and their names were
,
Lacie and
Tillie;
and they lived at the bottom of a well.
Once upon a time there were three little sisters; and their names were
Elsie
,
Lacie
Lacie
and
Tillie
Tillie
;
and they lived at the bottom of a well.
...
...
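The same text shows up many times in the listing above because printing a tag prints its whole subtree, and every nested tag and string is then visited again on its own. A minimal sketch of the difference between .children and .descendants, assuming a small hypothetical snippet and the built-in "html.parser":

```python
from bs4 import BeautifulSoup, NavigableString, Tag

# Hypothetical sample markup, used only for illustration.
html = "<div><p><b>bold</b> text</p></div>"
soup = BeautifulSoup(html, "html.parser")
div = soup.div

direct = list(div.children)      # only the <p> tag
nested = list(div.descendants)   # <p>, <b>, "bold", " text" in document order

tag_names = [d.name for d in nested if isinstance(d, Tag)]
strings = [str(d) for d in nested if isinstance(d, NavigableString)]

print(len(direct))   # 1
print(len(nested))   # 4
print(tag_names)     # ['p', 'b']
print(strings)       # ['bold', ' text']
```

Filtering the descendants by type, as above, is one way to avoid the repeated text when only the tags or only the strings are of interest.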
3. Node content: the .string attribute
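If a tag has exactly one child and that child is a NavigableString, .string returns that child. If a tag has exactly one child and that child is another tag, .string still works and gives the same result as the .string of that only child. If a tag has more than one child, .string is None.
In plain terms: if a tag contains no other tags, .string returns the text inside it. If a tag contains exactly one other tag, .string returns the innermost text. For example: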
print(soup.head.string)
#The Dormouse's story
print(soup.title.string)
#The Dormouse's story
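The three cases can be sketched as follows. The markup is a hypothetical sample, parsed with the built-in "html.parser"; get_text() is shown only as one possible fallback for tags with several children.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used only for illustration.
html = (
    "<head><title>The Dormouse's story</title></head>"
    "<p>one <b>two</b></p>"
)
soup = BeautifulSoup(html, "html.parser")

title_text = soup.title.string   # only child is a string
head_text = soup.head.string     # only child is <title>, so its .string is used
p_text = soup.p.string           # two children ("one " and <b>), so None
p_all_text = soup.p.get_text()   # concatenates all nested strings

print(title_text)   # The Dormouse's story
print(head_text)    # The Dormouse's story
print(p_text)       # None
print(p_all_text)   # one two
```

Checking for None before using .string avoids surprises on tags that mix text and nested tags.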