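Continued from HDP Study: HDFS Storage (Part 1)
8. HDFS Trash
Trash works like a recycle bin: deleted files and directories are moved temporarily to the .Trash/Current directory under the user's home directory, so that a file that was deleted, for instance by another user, can be restored easily. Deleted files are kept in the trash directory under their original path. For example, deleting /user/steve/dir1/fileA re-creates the file in the trash directory as: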
/user/steve/.Trash/Current/user/steve/dir1/fileA
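The path mapping in this example can be sketched in a few lines of Python. This is only an illustration of the naming rule, not Hadoop code: the function name trash_path is hypothetical, and it assumes the user's home directory is /user/<username>, as in the example.

```python
import posixpath


def trash_path(user: str, deleted_path: str) -> str:
    """Return where a deleted absolute HDFS path would appear in the user's trash.

    Assumption: the home directory is /user/<user>, as in the example above.
    """
    if not deleted_path.startswith("/"):
        raise ValueError("expected an absolute HDFS path")
    trash_root = posixpath.join("/user", user, ".Trash", "Current")
    # The full original path is re-created underneath the trash root.
    return trash_root + posixpath.normpath(deleted_path)


print(trash_path("steve", "/user/steve/dir1/fileA"))
# prints /user/steve/.Trash/Current/user/steve/dir1/fileA
```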
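There are limits, however: files deleted with the HDFS Shell or the Ambari Files View are protected by Trash, while files deleted through the Java API, WebHDFS, the HDFS NFS Gateway, or HUE are not. Trash behavior is governed by two properties: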
Property: fs.trash.checkpoint.interval
Set in: core-default.xml
Description: Determines how often the NameNode should checkpoint the .Trash directory. 0 means use the value set in fs.trash.interval.
Property: fs.trash.interval
Set in: core-site.xml
Description: Determines how often checkpoints in the .Trash directory should be removed. A value of 0 disables trash. The HDP default value is 360 minutes.
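The HDFS Shell -rm command accepts one option that affects Trash:
-skipTrash: equivalent to a permanent delete in Windows; the file is removed immediately and is not moved to the trash.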
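As a small illustration of where the option goes on the command line, the sketch below only builds the argument list for hdfs dfs -rm and does not run it. The helper name build_rm_command is hypothetical; the -r and -skipTrash flags are the ones accepted by the HDFS Shell.

```python
def build_rm_command(path, skip_trash=False, recursive=False):
    """Build the argv for an HDFS Shell delete without executing it."""
    argv = ["hdfs", "dfs", "-rm"]
    if recursive:
        argv.append("-r")          # delete a directory and its contents
    if skip_trash:
        argv.append("-skipTrash")  # bypass .Trash: the delete is permanent
    argv.append(path)
    return argv


# A normal delete goes through the trash; the second one does not.
print(" ".join(build_rm_command("/user/steve/dir1/fileA")))
print(" ".join(build_rm_command("/user/steve/dir1/fileA", skip_trash=True)))
```

On a cluster node, the resulting list could be passed to subprocess.run; that step is left out here so the sketch runs anywhere.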
9. HDFS Trash Operation
The figure below shows the Trash workflow. Explanation: The fs.trash.checkpoint.interval determines the number of minutes between trash checkpoints. If zero, the value is set to the value of fs.trash.interval. Zero is the HDP default. The number for fs.trash.checkpoint.interval should be smaller than or equal to fs.trash.interval.
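The fallback rule for the checkpoint interval can be written out as a short sketch. The function names are hypothetical and the code only models the two statements above (zero falls back to fs.trash.interval; the checkpoint interval should not exceed the trash interval). It is not taken from the Hadoop source.

```python
def effective_checkpoint_interval(trash_interval, checkpoint_interval=0):
    """Minutes between trash checkpoints, given the two configured values."""
    if trash_interval < 0 or checkpoint_interval < 0:
        raise ValueError("intervals must not be negative")
    # 0 means "use the value of fs.trash.interval".
    if checkpoint_interval == 0:
        return trash_interval
    return checkpoint_interval


def follows_recommendation(trash_interval, checkpoint_interval):
    """True if the checkpoint interval is <= the trash interval, as advised."""
    return effective_checkpoint_interval(trash_interval, checkpoint_interval) <= trash_interval


print(effective_checkpoint_interval(360))      # 360 (HDP defaults)
print(effective_checkpoint_interval(360, 60))  # 60
print(follows_recommendation(360, 720))        # False
```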
Every time the checkpointer runs, it renames the .Trash/Current directory to a new numeric name. For example, .Trash/Current could be renamed to .Trash/150518175000. When new files or directories are deleted, HDFS creates a new .Trash/Current directory to hold them.
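The rename step can be modeled with a plain dictionary standing in for the .Trash directory. This is a toy simulation, not HDFS code: the helper names are hypothetical, and the yyMMddHHmmss timestamp format is an assumption inferred from the example name 150518175000.

```python
from datetime import datetime

# Assumed format, inferred from the example ".Trash/150518175000".
CHECKPOINT_FORMAT = "%y%m%d%H%M%S"


def delete_to_trash(trash, path):
    """Record a deletion; creates 'Current' if it does not exist yet."""
    trash.setdefault("Current", []).append(path)


def run_checkpoint(trash, when):
    """Rename 'Current' to a numeric timestamp name, if there is anything in it."""
    current = trash.pop("Current", None)
    if current:
        trash[when.strftime(CHECKPOINT_FORMAT)] = current


trash = {}
delete_to_trash(trash, "/user/steve/dir1/fileA")
run_checkpoint(trash, datetime(2015, 5, 18, 17, 50, 0))
delete_to_trash(trash, "/user/steve/dir1/fileB")  # lands in a fresh 'Current'
print(trash)
# {'150518175000': ['/user/steve/dir1/fileA'], 'Current': ['/user/steve/dir1/fileB']}
```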
How long the older, now renamed checkpoint directory (with its deleted files and directories) is retained is determined by the fs.trash.interval property in core-site.xml. It sets the number of minutes after which the checkpoint directory gets deleted. If zero, the trash feature is disabled. The HDP default is 360 minutes. It is important to note that expiry is decided per checkpoint directory, not per file: it is not the individual files and directories older than fs.trash.interval that are deleted, but the checkpoint directory that is older than fs.trash.interval.
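To make the per-checkpoint expiry rule concrete, the sketch below picks the checkpoint directories that are due for deletion from a list of directory names. It is a simplified model: the function name is hypothetical, and it assumes checkpoint names encode their creation time as yyMMddHHmmss, as the earlier example name suggests. The age of individual files never enters the calculation.

```python
from datetime import datetime, timedelta

# Assumed naming scheme for checkpoint directories (yyMMddHHmmss).
CHECKPOINT_FORMAT = "%y%m%d%H%M%S"


def expired_checkpoints(names, now, trash_interval_minutes):
    """Return the checkpoint directories older than fs.trash.interval."""
    if trash_interval_minutes <= 0:
        return []  # 0 disables trash, so there is nothing to expire
    cutoff = now - timedelta(minutes=trash_interval_minutes)
    expired = []
    for name in names:
        if name == "Current":
            continue  # not a checkpoint yet
        created = datetime.strptime(name, CHECKPOINT_FORMAT)
        if created <= cutoff:
            expired.append(name)
    return expired


names = ["Current", "150518175000", "150518235000"]
now = datetime(2015, 5, 19, 0, 0, 0)
print(expired_checkpoints(names, now, 360))
# ['150518175000']  (created 17:50, more than 360 minutes before midnight)
```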
The fs.trash.interval may be configured both on the server and the client. If trash is disabled on the server side then the client side configuration is checked. If trash is enabled on the server side then the value configured on the server is used and the client configuration value is ignored.
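The server and client precedence described above reduces to a single conditional. The function below is a hypothetical helper that mirrors the paragraph; it does not read any real configuration.

```python
def effective_trash_interval(server_interval, client_interval):
    """Resolve fs.trash.interval (minutes) from server and client settings."""
    if server_interval > 0:
        # Trash is enabled on the server: the client value is ignored.
        return server_interval
    # Trash is disabled on the server: fall back to the client value.
    return client_interval


print(effective_trash_interval(360, 10))  # 360
print(effective_trash_interval(0, 10))    # 10
print(effective_trash_interval(0, 0))     # 0, trash is disabled everywhere
```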
10. Overriding HDFS Default Properties