Commit a13eea6b, authored by Linus Torvalds

Merge tag 'for-3.8-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs

Pull new F2FS filesystem from Jaegeuk Kim:
 "Introduce a new file system, Flash-Friendly File System (F2FS), to
  Linux 3.8.

  Highlights:
   - Add initial f2fs source codes
   - Fix an endian conversion bug
   - Fix build failures on random configs
   - Fix the power-off-recovery routine
   - Minor cleanup, coding style, and typos patches"

From the Kconfig help text:

  F2FS is based on Log-structured File System (LFS), which supports
  versatile "flash-friendly" features. The design has been focused on
  addressing the fundamental issues in LFS, which are snowball effect
  of wandering tree and high cleaning overhead.

  Since flash-based storages show different characteristics according to
  the internal geometry or flash memory management schemes aka FTL, F2FS
  and tools support various parameters not only for configuring on-disk
  layout, but also for selecting allocation and cleaning algorithms.

and there's an article by Neil Brown about it on lwn.net:

  http://lwn.net/Articles/518988/

* tag 'for-3.8-merge' of git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs: (36 commits)
  f2fs: fix tracking parent inode number
  f2fs: cleanup the f2fs_bio_alloc routine
  f2fs: introduce accessor to retrieve number of dentry slots
  f2fs: remove redundant call to f2fs_put_page in delete entry
  f2fs: make use of GFP_F2FS_ZERO for setting gfp_mask
  f2fs: rewrite f2fs_bio_alloc to make it simpler
  f2fs: fix a typo in f2fs documentation
  f2fs: remove unused variable
  f2fs: move error condition for mkdir at proper place
  f2fs: remove unneeded initialization
  f2fs: check read only condition before beginning write out
  f2fs: remove unneeded memset from init_once
  f2fs: show error in case of invalid mount arguments
  f2fs: fix the compiler warning for uninitialized use of variable
  f2fs: resolve build failures
  f2fs: adjust kernel coding style
  f2fs: fix endian conversion bugs reported by sparse
  f2fs: remove unneeded version.h header file from f2fs.h
  f2fs: update the f2fs document
  f2fs: update Kconfig and Makefile
  ...
......@@ -50,6 +50,8 @@ ext4.txt
- info, mount options and specifications for the Ext4 filesystem.
files.txt
- info on file management in the Linux kernel.
f2fs.txt
- info and mount options for the F2FS filesystem.
fuse.txt
- info on the Filesystem in User SpacE including mount options.
gfs2.txt
......
================================================================================
WHAT IS Flash-Friendly File System (F2FS)?
================================================================================
NAND flash memory-based storage devices, such as SSDs, eMMC, and SD cards, are
used in a wide variety of systems ranging from mobile devices to servers. Since
they have different characteristics from conventional rotating disks, a file
system, as the layer directly above the storage device, should be designed from
scratch with those characteristics in mind.

F2FS is a file system designed for NAND flash memory-based storage devices. It
is based on the Log-structured File System (LFS). The design focuses on
addressing the fundamental issues in LFS, namely the snowball effect of the
wandering tree and high cleaning overhead.

Since a NAND flash memory-based storage device shows different characteristics
according to its internal geometry or flash memory management scheme, namely the
FTL, F2FS and its tools support various parameters not only for configuring the
on-disk layout, but also for selecting allocation and cleaning algorithms.
The file system formatting tool, "mkfs.f2fs", is available from the following
git tree:
>> git://git.kernel.org/pub/scm/linux/kernel/git/jaegeuk/f2fs-tools.git
For reporting bugs and sending patches, please use the following mailing list:
>> linux-f2fs-devel@lists.sourceforge.net
================================================================================
BACKGROUND AND DESIGN ISSUES
================================================================================
Log-structured File System (LFS)
--------------------------------
"A log-structured file system writes all modifications to disk sequentially in
a log-like structure, thereby speeding up both file writing and crash recovery.
The log is the only structure on disk; it contains indexing information so that
files can be read back from the log efficiently. In order to maintain large free
areas on disk for fast writing, we divide the log into segments and use a
segment cleaner to compress the live information from heavily fragmented
segments." from Rosenblum, M. and Ousterhout, J. K., 1992, "The design and
implementation of a log-structured file system", ACM Trans. Computer Systems
10, 1, 26–52.
Wandering Tree Problem
----------------------
In LFS, when file data is updated and written to the end of the log, its direct
pointer block is updated because the data location has changed. Then the
indirect pointer block is also updated because of the direct pointer block
update. In this manner, the upper index structures such as the inode, inode map,
and checkpoint block are also updated recursively. This is called the wandering
tree problem [1]. To improve performance, a file system should eliminate or
limit this update propagation as much as possible.
[1] Bityutskiy, A. 2005. JFFS3 design issues. http://www.linux-mtd.infradead.org/
Cleaning Overhead
-----------------
Since LFS is based on out-of-place writes, it produces many obsolete blocks
scattered across the whole storage. In order to provide new empty log space, it
needs to reclaim these obsolete blocks transparently to users. This job is
called the cleaning process.

The process consists of four operations as follows.
1. A victim segment is selected by consulting the segment usage table.
2. It loads the parent index structures of all the data in the victim, as
   identified by the segment summary blocks.
3. It checks the cross-reference between the data and its parent index structure.
4. It moves valid data selectively.

This cleaning job may cause unexpectedly long delays, so the most important goal
is to hide the latency from users. It should also reduce the amount of valid
data to be moved, and move that data quickly.
================================================================================
KEY FEATURES
================================================================================
Flash Awareness
---------------
- Enlarge the random write area for better performance, while preserving high
  spatial locality
- Align FS data structures to the operational units in the FTL on a best-effort
  basis
Wandering Tree Problem
----------------------
- Use a term, “node”, that represents inodes as well as various pointer blocks
- Introduce Node Address Table (NAT) containing the locations of all the “node”
blocks; this will cut off the update propagation.
Cleaning Overhead
-----------------
- Support a background cleaning process
- Support greedy and cost-benefit algorithms for victim selection policies
- Support multi-head logs for static/dynamic hot and cold data separation
- Introduce adaptive logging for efficient block allocation
================================================================================
MOUNT OPTIONS
================================================================================
background_gc_off      Turn off cleaning operations, namely garbage collection,
                       triggered in background when I/O subsystem is idle.
disable_roll_forward   Disable the roll-forward recovery routine.
discard                Issue discard/TRIM commands when a segment is cleaned.
no_heap                Disable heap-style segment allocation, which finds free
                       segments for data from the beginning of the main area,
                       and for nodes from the end of the main area.
nouser_xattr           Disable Extended User Attributes. Note: xattr is enabled
                       by default if CONFIG_F2FS_FS_XATTR is selected.
noacl                  Disable POSIX Access Control List. Note: acl is enabled
                       by default if CONFIG_F2FS_FS_POSIX_ACL is selected.
active_logs=%u         Support configuring the number of active logs. In the
                       current design, f2fs supports only 2, 4, and 6 logs.
                       Default number is 6.
disable_ext_identify   Disable the extension list configured by mkfs, so f2fs
                       is not aware of cold files such as media files.
================================================================================
DEBUGFS ENTRIES
================================================================================
/sys/kernel/debug/f2fs/ contains information about all the partitions mounted as
f2fs. Each file shows the whole f2fs information.
/sys/kernel/debug/f2fs/status includes:
- major file system information managed by f2fs currently
- average SIT information about whole segments
- current memory footprint consumed by f2fs.
================================================================================
USAGE
================================================================================
1. Download userland tools and compile them.

2. Skip this step if f2fs was compiled statically into the kernel.
   Otherwise, insert the f2fs.ko module.
 # insmod f2fs.ko

3. Create a directory to use as the mount point
 # mkdir /mnt/f2fs

4. Format the block device, and then mount it as f2fs
 # mkfs.f2fs -l label /dev/block_device
 # mount -t f2fs /dev/block_device /mnt/f2fs
Format options
--------------
-l [label] : Give a volume label, up to 256 unicode name.
-a [0 or 1] : Split start location of each area for heap-based allocation.
1 is set by default, which performs this.
-o [int] : Set overprovision ratio in percent over volume size.
5 is set by default.
-s [int] : Set the number of segments per section.
1 is set by default.
-z [int] : Set the number of sections per zone.
1 is set by default.
-e [str] : Set basic extension list. e.g. "mp3,gif,mov"
================================================================================
DESIGN
================================================================================
On-disk Layout
--------------
F2FS divides the whole volume into a number of segments, each of which is fixed
at 2MB in size. A section is composed of consecutive segments, and a zone
consists of a set of sections. By default, the section and zone sizes are both
equal to one segment, but users can easily change them with mkfs.

F2FS splits the entire volume into six areas, and all the areas except the
superblock consist of multiple segments as described below.
align with the zone size <-|
|-> align with the segment size
_________________________________________________________________________
| | | Node | Segment | Segment | |
| Superblock | Checkpoint | Address | Info. | Summary | Main |
| (SB) | (CP) | Table (NAT) | Table (SIT) | Area (SSA) | |
|____________|_____2______|______N______|______N______|______N_____|__N___|
. .
. .
. .
._________________________________________.
|_Segment_|_..._|_Segment_|_..._|_Segment_|
. .
._________._________
|_section_|__...__|_
. .
.________.
|__zone__|
- Superblock (SB)
: It is located at the beginning of the partition, and two copies are kept to
guard against file system corruption. It contains basic partition information
and some default parameters of f2fs.
- Checkpoint (CP)
: It contains file system information, bitmaps for valid NAT/SIT sets, orphan
inode lists, and summary entries of current active segments.
- Node Address Table (NAT)
: It is composed of a block address table for all the node blocks stored in
Main area.
- Segment Information Table (SIT)
: It contains segment information such as valid block count and bitmap for the
validity of all the blocks.
- Segment Summary Area (SSA)
: It contains summary entries which contain the owner information of all the
data and node blocks stored in Main area.
- Main Area
: It contains file and directory data including their indices.
In order to avoid misalignment between the file system and flash-based storage,
F2FS aligns the start block address of CP with the segment size. It also aligns
the start block address of the Main area with the zone size by reserving some
segments in the SSA area.
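
The alignment rule above can be sketched as simple round-up arithmetic. This is
only an illustration: it assumes 4KB blocks and the 2MB segment size stated
earlier, and the names align_up() and main_area_start() are hypothetical helpers
that do not come from the f2fs sources.

```c
#include <stdint.h>

/* Assumed geometry: 4KB blocks and 2MB segments, as stated in the text. */
#define BLOCK_SIZE    4096u
#define SEGMENT_SIZE  (2u * 1024u * 1024u)
#define BLKS_PER_SEG  (SEGMENT_SIZE / BLOCK_SIZE)	/* 512 blocks */

/* Round blkaddr up to the next multiple of unit (unit must be non-zero). */
static uint64_t align_up(uint64_t blkaddr, uint64_t unit)
{
	return (blkaddr + unit - 1) / unit * unit;
}

/*
 * Hypothetical helper: given the first block after the SSA contents, return
 * the zone-aligned block address where the Main area could start. The blocks
 * skipped in between correspond to the segments reserved in the SSA area.
 */
static uint64_t main_area_start(uint64_t ssa_end_blk,
				uint32_t segs_per_sec, uint32_t secs_per_zone)
{
	uint64_t blks_per_zone =
		(uint64_t)BLKS_PER_SEG * segs_per_sec * secs_per_zone;

	return align_up(ssa_end_blk, blks_per_zone);
}
```

With the default of one segment per section and one section per zone, the zone
size equals the segment size, so both alignments coincide.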
Refer to the following survey for additional technical details.
https://wiki.linaro.org/WorkingGroups/Kernel/Projects/FlashCardSurvey
File System Metadata Structure
------------------------------
F2FS adopts a checkpointing scheme to maintain file system consistency. At
mount time, F2FS first tries to find the last valid checkpoint data by scanning
the CP area. In order to reduce the scanning time, F2FS uses only two copies of
CP. One of them always holds the last valid data; this is called the shadow copy
mechanism. In addition to CP, NAT and SIT also adopt the shadow copy mechanism.

For file system consistency, each CP indicates which NAT and SIT copies are
valid, as shown below.
+--------+----------+---------+
| CP | NAT | SIT |
+--------+----------+---------+
. . . .
. . . .
. . . .
+-------+-------+--------+--------+--------+--------+
| CP #0 | CP #1 | NAT #0 | NAT #1 | SIT #0 | SIT #1 |
+-------+-------+--------+--------+--------+--------+
| ^ ^
| | |
`----------------------------------------'
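
A minimal sketch of how a mount-time routine might choose between the two CP
copies is given here. It assumes each copy carries a validity result and a
monotonically increasing version number; struct cp_copy and pick_checkpoint()
are hypothetical names for illustration and do not describe the on-disk format.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory summary of one CP copy; not the on-disk layout. */
struct cp_copy {
	bool     valid;		/* copy passed its integrity check */
	uint64_t version;	/* assumed to grow with every checkpoint */
};

/* Return the newest valid copy, or NULL if neither copy is usable. */
static const struct cp_copy *pick_checkpoint(const struct cp_copy *cp0,
					     const struct cp_copy *cp1)
{
	if (cp0->valid && cp1->valid)
		return cp1->version > cp0->version ? cp1 : cp0;
	if (cp0->valid)
		return cp0;
	if (cp1->valid)
		return cp1;
	return NULL;
}
```

Because only two copies exist, the scan is bounded, and an interrupted write of
one copy still leaves the other copy usable.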
Index Structure
---------------
The key data structure for managing data locations is a "node". Similar to
traditional file structures, F2FS has three types of node: inode, direct node,
and indirect node. F2FS assigns 4KB to an inode block, which contains 923 data
block indices, two direct node pointers, two indirect node pointers, and one
double indirect node pointer as described below. One direct node block contains
pointers to 1018 data blocks, and one indirect node block contains pointers to
1018 node blocks. Thus, one inode block (i.e., a file) covers:

  4KB * (923 + 2 * 1018 + 2 * 1018 * 1018 + 1018 * 1018 * 1018) := 3.94TB.
Inode block (4KB)
|- data (923)
|- direct node (2)
| `- data (1018)
|- indirect node (2)
| `- direct node (1018)
| `- data (1018)
`- double indirect node (1)
`- indirect node (1018)
`- direct node (1018)
`- data (1018)
Note that all the node blocks are mapped by the NAT, which means the location
of each node is translated through the NAT. With respect to the wandering tree
problem, this lets F2FS cut off the propagation of node updates caused by leaf
data writes.
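
The maximum file size quoted above can be checked with a few lines of
arithmetic. The sketch below uses only the numbers given in the text (923 and
1018 pointers, 4KB blocks); the macro and function names are chosen for
illustration.

```c
#include <stdint.h>

#define BLKSIZE          4096ULL
#define ADDRS_PER_INODE  923ULL		/* data block indices in an inode block */
#define ADDRS_PER_BLOCK  1018ULL	/* data block pointers in a direct node */
#define NIDS_PER_BLOCK   1018ULL	/* node pointers in an indirect node */

/* Number of data blocks reachable from a single inode block. */
static uint64_t max_file_blocks(void)
{
	/* inline indices plus the two direct nodes */
	uint64_t direct = ADDRS_PER_INODE + 2 * ADDRS_PER_BLOCK;
	/* two indirect nodes, each pointing to 1018 direct nodes */
	uint64_t indirect = 2 * NIDS_PER_BLOCK * ADDRS_PER_BLOCK;
	/* one double indirect node */
	uint64_t dindirect = NIDS_PER_BLOCK * NIDS_PER_BLOCK * ADDRS_PER_BLOCK;

	return direct + indirect + dindirect;
}

/* Maximum file size in bytes. */
static uint64_t max_file_bytes(void)
{
	return max_file_blocks() * BLKSIZE;
}
```

The result is 1,057,053,439 blocks, or 4,329,690,886,144 bytes, which is about
3.94 when expressed in units of 2^40 bytes.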
Directory Structure
-------------------
A directory entry occupies 11 bytes and consists of the following attributes.
- hash hash value of the file name
- ino inode number
- len the length of file name
- type file type such as directory, symlink, etc
A dentry block consists of 214 dentry slots and file names. Therein a bitmap is
used to represent whether each dentry is valid or not. A dentry block occupies
4KB with the following composition.
Dentry Block(4 K) = bitmap (27 bytes) + reserved (3 bytes) +
dentries(11 * 214 bytes) + file name (8 * 214 bytes)
[Bucket]
+--------------------------------+
|dentry block 1 | dentry block 2 |
+--------------------------------+
. .
. .
. [Dentry Block Structure: 4KB] .
+--------+----------+----------+------------+
| bitmap | reserved | dentries | file names |
+--------+----------+----------+------------+
[Dentry Block: 4KB] . .
. .
. .
+------+------+-----+------+
| hash | ino | len | type |
+------+------+-----+------+
[Dentry Structure: 11 bytes]
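
The sizes above can be cross-checked with a packed C layout. This is a sketch
under assumptions: the field widths (4 + 4 + 2 + 1 bytes) are inferred from the
11-byte total, host-endian integer types stand in for the on-disk types, the
struct and macro names are illustrative, and __attribute__((packed)) is a
GCC/Clang extension.

```c
#include <stddef.h>
#include <stdint.h>

#define NR_DENTRY_IN_BLOCK     214
#define SIZE_OF_DENTRY_BITMAP  27	/* ceil(214 / 8) bytes */
#define SIZE_OF_RESERVED       3
#define SLOT_LEN               8	/* file name bytes per dentry slot */

/* One directory entry; field widths assumed from the 11-byte total. */
struct dir_entry {
	uint32_t hash;		/* hash value of the file name */
	uint32_t ino;		/* inode number */
	uint16_t name_len;	/* length of the file name */
	uint8_t  file_type;	/* directory, symlink, etc. */
} __attribute__((packed));

/* One 4KB dentry block: bitmap + reserved + dentries + file names. */
struct dentry_block {
	uint8_t          bitmap[SIZE_OF_DENTRY_BITMAP];
	uint8_t          reserved[SIZE_OF_RESERVED];
	struct dir_entry dentry[NR_DENTRY_IN_BLOCK];
	uint8_t          filename[NR_DENTRY_IN_BLOCK][SLOT_LEN];
} __attribute__((packed));

_Static_assert(sizeof(struct dir_entry) == 11, "dentry must be 11 bytes");
_Static_assert(sizeof(struct dentry_block) == 4096, "block must be 4KB");

/* Number of consecutive slots needed to hold a name of name_len bytes. */
static size_t slots_for_name(size_t name_len)
{
	return (name_len + SLOT_LEN - 1) / SLOT_LEN;
}
```

Since each slot holds 8 bytes of file name, a name longer than 8 bytes occupies
several consecutive slots, which is why the bitmap tracks slots individually.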
F2FS implements multi-level hash tables for directory structure. Each level has
a hash table with dedicated number of hash buckets as shown below. Note that
"A(2B)" means a bucket includes 2 data blocks.
----------------------
A : bucket
B : block
N : MAX_DIR_HASH_DEPTH
----------------------
level #0 | A(2B)
|
level #1 | A(2B) - A(2B)
|
level #2 | A(2B) - A(2B) - A(2B) - A(2B)
. | . . . .
level #N/2 | A(2B) - A(2B) - A(2B) - A(2B) - A(2B) - ... - A(2B)
. | . . . .
level #N | A(4B) - A(4B) - A(4B) - A(4B) - A(4B) - ... - A(4B)
The number of blocks and buckets are determined by,
,- 2, if n < MAX_DIR_HASH_DEPTH / 2,
# of blocks in level #n = |
`- 4, Otherwise
,- 2^n, if n < MAX_DIR_HASH_DEPTH / 2,
# of buckets in level #n = |
`- 2^((MAX_DIR_HASH_DEPTH / 2) - 1), Otherwise
When F2FS looks up a file name in a directory, it first calculates a hash value
of the file name. Then, F2FS scans the hash table in level #0 to find the
dentry consisting of the file name and its inode number. If it is not found,
F2FS scans the next hash table in level #1. In this way, F2FS scans the hash
tables in each level incrementally from 1 to N. In each level, F2FS needs to
scan only one bucket, determined by the following equation, which gives
O(log(# of files)) complexity.

  bucket number to scan in level #n = (hash value) % (# of buckets in level #n)
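
The formulas for blocks per bucket, buckets per level, and the bucket to scan
translate directly into C. In this sketch the value 63 for MAX_DIR_HASH_DEPTH
is an assumption (the text only names the constant), and bucket_to_scan() is an
illustrative helper name.

```c
#include <stdint.h>

#define MAX_DIR_HASH_DEPTH 63	/* assumed value; the text does not state it */

/* Number of blocks in each bucket of the given level. */
static unsigned int bucket_blocks(unsigned int level)
{
	return level < MAX_DIR_HASH_DEPTH / 2 ? 2 : 4;
}

/* Number of buckets in the given level. */
static unsigned long dir_buckets(unsigned int level)
{
	if (level < MAX_DIR_HASH_DEPTH / 2)
		return 1UL << level;
	return 1UL << (MAX_DIR_HASH_DEPTH / 2 - 1);
}

/* Index of the single bucket that a lookup has to scan in the given level. */
static unsigned long bucket_to_scan(uint32_t hash, unsigned int level)
{
	return hash % dir_buckets(level);
}
```

A lookup therefore touches at most one bucket per level, and the number of
levels in use grows roughly with the logarithm of the number of entries.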
In the case of file creation, F2FS finds empty consecutive slots that cover the
file name. F2FS searches for the empty slots in the hash tables of all levels
from 1 to N in the same way as the lookup operation.
The following figure shows an example of two cases holding children.
--------------> Dir <--------------
| |
child child
child - child [hole] - child
child - child - child [hole] - [hole] - child
Case 1: Case 2:
Number of children = 6, Number of children = 3,
File size = 7 File size = 7
Default Block Allocation
------------------------
At runtime, F2FS manages six active logs inside "Main" area: Hot/Warm/Cold node
and Hot/Warm/Cold data.
- Hot node contains direct node blocks of directories.
- Warm node contains direct node blocks except hot node blocks.
- Cold node contains indirect node blocks
- Hot data contains dentry blocks
- Warm data contains data blocks except hot and cold data blocks
- Cold data contains multimedia data or migrated data blocks
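
The six-way classification listed above can be sketched as a small decision
function. struct blk_info and its fields are hypothetical inputs invented for
this illustration; how a real implementation detects each property is not
covered here.

```c
#include <stdbool.h>

enum log_type {
	HOT_NODE, WARM_NODE, COLD_NODE,
	HOT_DATA, WARM_DATA, COLD_DATA,
};

/* Hypothetical description of a block about to be written. */
struct blk_info {
	bool is_node;		/* node block (true) or data block (false) */
	bool is_indirect;	/* node only: indirect node block */
	bool owner_is_dir;	/* block belongs to a directory */
	bool cold_file;		/* data only: multimedia file */
	bool migrated;		/* data only: moved by the cleaner */
};

/* Pick the active log for a block, following the list above. */
static enum log_type classify(const struct blk_info *b)
{
	if (b->is_node) {
		if (b->is_indirect)
			return COLD_NODE;
		return b->owner_is_dir ? HOT_NODE : WARM_NODE;
	}
	if (b->owner_is_dir)
		return HOT_DATA;	/* dentry block */
	if (b->cold_file || b->migrated)
		return COLD_DATA;
	return WARM_DATA;
}
```

Keeping blocks with similar update frequency in the same log is what allows
whole segments to become invalid together and reduces later cleaning work.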
LFS has two schemes for free space management: threaded log and
copy-and-compaction. The copy-and-compaction scheme, which is known as cleaning,
is well-suited for devices showing very good sequential write performance, since
free segments are served all the time for writing new data. However, it suffers
from cleaning overhead under high utilization. In contrast, the threaded log
scheme suffers from random writes, but needs no cleaning process. F2FS adopts a
hybrid scheme where the copy-and-compaction scheme is used by default, but the
policy is dynamically changed to the threaded log scheme according to the file
system status.
In order to align F2FS with the underlying flash-based storage, F2FS allocates
segments in units of a section. F2FS expects the section size to be the same as
the unit size of garbage collection in the FTL. Furthermore, with respect to the
mapping granularity in the FTL, F2FS allocates each section of the active logs
from different zones as much as possible, since the FTL can write the data in
the active logs into one allocation unit according to its mapping granularity.
Cleaning process
----------------
F2FS does cleaning both on demand and in the background. On-demand cleaning is
triggered when there are not enough free segments to serve VFS calls. The
background cleaner is run by a kernel thread, and triggers the cleaning job when
the system is idle.

F2FS supports two victim selection policies: greedy and cost-benefit algorithms.
In the greedy algorithm, F2FS selects the victim segment having the smallest
number of valid blocks. In the cost-benefit algorithm, F2FS selects a victim
segment according to the segment age and the number of valid blocks in order to
address the log block thrashing problem of the greedy algorithm. F2FS uses the
greedy algorithm for the on-demand cleaner, while the background cleaner uses
the cost-benefit algorithm.
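
The two policies can be contrasted with a short sketch. The cost-benefit score
used here is the (1 - u) * age / (1 + u) ratio from the LFS paper cited earlier
in this document; whether F2FS computes exactly this expression is an
assumption, and struct seg_usage with its fields is hypothetical. Both
functions expect n to be at least 1.

```c
#include <stddef.h>
#include <stdint.h>

#define BLKS_PER_SEG 512u	/* 2MB segment / 4KB block */

/* Hypothetical per-segment usage record. */
struct seg_usage {
	uint32_t valid_blocks;	/* number of still-valid blocks */
	uint32_t age;		/* time since last modification, any unit */
};

/* Greedy: choose the segment with the fewest valid blocks. */
static size_t pick_greedy(const struct seg_usage *segs, size_t n)
{
	size_t best = 0;

	for (size_t i = 1; i < n; i++)
		if (segs[i].valid_blocks < segs[best].valid_blocks)
			best = i;
	return best;
}

/* LFS-style benefit/cost ratio, where u is the segment utilization. */
static double cost_benefit(const struct seg_usage *s)
{
	double u = (double)s->valid_blocks / BLKS_PER_SEG;

	return (1.0 - u) * s->age / (1.0 + u);
}

/* Cost-benefit: choose the segment with the highest benefit/cost ratio. */
static size_t pick_cost_benefit(const struct seg_usage *segs, size_t n)
{
	size_t best = 0;

	for (size_t i = 1; i < n; i++)
		if (cost_benefit(&segs[i]) > cost_benefit(&segs[best]))
			best = i;
	return best;
}
```

Greedy minimizes the data copied right now, while cost-benefit prefers older
segments whose remaining valid blocks are unlikely to be invalidated soon.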
In order to identify whether the data in the victim segment are valid or not,
F2FS manages a bitmap. Each bit represents the validity of a block, and the
bitmap is composed of a bit stream covering all blocks in the main area.
......@@ -220,6 +220,7 @@ source "fs/pstore/Kconfig"
source "fs/sysv/Kconfig"
source "fs/ufs/Kconfig"
source "fs/exofs/Kconfig"
source "fs/f2fs/Kconfig"
endif # MISC_FILESYSTEMS
......
......@@ -123,6 +123,7 @@ obj-$(CONFIG_DEBUG_FS) += debugfs/
obj-$(CONFIG_OCFS2_FS) += ocfs2/
obj-$(CONFIG_BTRFS_FS) += btrfs/
obj-$(CONFIG_GFS2_FS) += gfs2/
obj-$(CONFIG_F2FS_FS) += f2fs/
obj-y += exofs/ # Multiple modules
obj-$(CONFIG_CEPH_FS) += ceph/
obj-$(CONFIG_PSTORE) += pstore/
config F2FS_FS
	tristate "F2FS filesystem support (EXPERIMENTAL)"
	depends on BLOCK
	help
	  F2FS is based on Log-structured File System (LFS), which supports
	  versatile "flash-friendly" features. The design has been focused on
	  addressing the fundamental issues in LFS, which are snowball effect
	  of wandering tree and high cleaning overhead.

	  Since flash-based storages show different characteristics according to
	  the internal geometry or flash memory management schemes aka FTL, F2FS
	  and tools support various parameters not only for configuring on-disk
	  layout, but also for selecting allocation and cleaning algorithms.

	  If unsure, say N.

config F2FS_STAT_FS
	bool "F2FS Status Information"
	depends on F2FS_FS && DEBUG_FS
	default y
	help
	  /sys/kernel/debug/f2fs/ contains information about all the partitions
	  mounted as f2fs. Each file shows the whole f2fs information.

	  /sys/kernel/debug/f2fs/status includes:
	    - major file system information managed by f2fs currently
	    - average SIT information about whole segments
	    - current memory footprint consumed by f2fs.

config F2FS_FS_XATTR
	bool "F2FS extended attributes"
	depends on F2FS_FS
	default y
	help
	  Extended attributes are name:value pairs associated with inodes by
	  the kernel or by users (see the attr(5) manual page, or visit
	  <http://acl.bestbits.at/> for details).

	  If unsure, say N.

config F2FS_FS_POSIX_ACL
	bool "F2FS Access Control Lists"
	depends on F2FS_FS_XATTR
	select FS_POSIX_ACL
	default y
	help
	  POSIX Access Control Lists (ACLs) support permissions for users and
	  groups beyond the owner/group/world scheme.

	  To learn more about Access Control Lists, visit the POSIX ACLs for
	  Linux website <http://acl.bestbits.at/>.

	  If you don't know what Access Control Lists are, say N.
obj-$(CONFIG_F2FS_FS) += f2fs.o
f2fs-y := dir.o file.o inode.o namei.o hash.o super.o
f2fs-y += checkpoint.o gc.o data.o node.o segment.o recovery.o
f2fs-$(CONFIG_F2FS_STAT_FS) += debug.o
f2fs-$(CONFIG_F2FS_FS_XATTR) += xattr.o
f2fs-$(CONFIG_F2FS_FS_POSIX_ACL) += acl.o
/*
* fs/f2fs/acl.c
*
* Copyright (c) 2012 Samsung Electronics Co., Ltd.
* http://www.samsung.com/
*
* Portions of this code from linux/fs/ext2/acl.c
*
* Copyright (C) 2001-2003 Andreas Gruenbacher, <agruen@suse.de>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/f2fs_fs.h>
#include "f2fs.h"
#include "xattr.h"
#include "acl.h"

#define get_inode_mode(i)	((is_inode_flag_set(F2FS_I(i), FI_ACL_MODE)) ? \
					(F2FS_I(i)->i_acl_mode) : ((i)->i_mode))

static inline size_t f2fs_acl_size(int count)
{
	if (count <= 4) {
		return sizeof(struct f2fs_acl_header) +
			count * sizeof(struct f2fs_acl_entry_short);
	} else {
		return sizeof(struct f2fs_acl_header) +
			4 * sizeof(struct f2fs_acl_entry_short) +
			(count - 4) * sizeof(struct f2fs_acl_entry);
	}
}

static inline int f2fs_acl_count(size_t size)
{
	ssize_t s;

	size -= sizeof(struct f2fs_acl_header);
	s = size - 4 * sizeof(struct f2fs_acl_entry_short);
	if (s < 0) {
		if (size % sizeof(struct f2fs_acl_entry_short))
			return -1;
		return size / sizeof(struct f2fs_acl_entry_short);
	} else {
		if (s % sizeof(struct f2fs_acl_entry))
			return -1;
		return s / sizeof(struct f2fs_acl_entry) + 4;
	}
}

static struct posix_acl *f2fs_acl_from_disk(const char *value, size_t size)
{
	int i, count;
	struct posix_acl *acl;
	struct f2fs_acl_header *hdr = (struct f2fs_acl_header *)value;
	struct f2fs_acl_entry *entry = (struct f2fs_acl_entry *)(hdr + 1);
	const char *end = value + size;

	if (hdr->a_version != cpu_to_le32(F2FS_ACL_VERSION))
		return ERR_PTR(-EINVAL);

	count = f2fs_acl_count(size);
	if (count < 0)
		return ERR_PTR(-EINVAL);
	if (count == 0)
		return NULL;

	acl = posix_acl_alloc(count, GFP_KERNEL);
	if (!acl)
		return ERR_PTR(-ENOMEM);

	for (i = 0; i < count; i++) {

		if ((char *)entry > end)
			goto fail;

		acl->a_entries[i].e_tag  = le16_to_cpu(entry->e_tag);
		acl->a_entries[i].e_perm = le16_to_cpu(entry->e_perm);

		switch (acl->a_entries[i].e_tag) {
		case ACL_USER_OBJ:
		case ACL_GROUP_OBJ:
		case ACL_MASK:
		case ACL_OTHER:
			acl->a_entries[i].e_id = ACL_UNDEFINED_ID;
			entry = (struct f2fs_acl_entry *)((char *)entry +
					sizeof(struct f2fs_acl_entry_short));
			break;

		case ACL_USER:
			acl->a_entries[i].e_uid =
				make_kuid(&init_user_ns,
						le32_to_cpu(entry->e_id));
			entry = (struct f2fs_acl_entry *)((char *)entry +
					sizeof(struct f2fs_acl_entry));
			break;
		case ACL_GROUP:
			acl->a_entries[i].e_gid =
				make_kgid(&init_user_ns,
						le32_to_cpu(entry->e_id));
			entry = (struct f2fs_acl_entry *)((char *)entry +
					sizeof(struct f2fs_acl_entry));
			break;
		default:
			goto fail;
		}
	}
	if ((char *)entry != end)
		goto fail;
	return acl;
fail:
	posix_acl_release(acl);
	return ERR_PTR(-EINVAL);
}

static void *f2fs_acl_to_disk(const struct posix_acl *acl, size_t *size)
{
	struct f2fs_acl_header *f2fs_acl;
	struct f2fs_acl_entry *entry;
	int i;

	f2fs_acl = kmalloc(sizeof(struct f2fs_acl_header) + acl->a_count *
			sizeof(struct f2fs_acl_entry), GFP_KERNEL);
	if (!f2fs_acl)
		return ERR_PTR(-ENOMEM);

	f2fs_acl->a_version = cpu_to_le32(F2FS_ACL_VERSION);
	entry = (struct f2fs_acl_entry *)(f2fs_acl + 1);

	for (i = 0; i < acl->a_count; i++) {

		entry->e_tag  = cpu_to_le16(acl->a_entries[i].e_tag);
		entry->e_perm = cpu_to_le16(acl->a_entries[i].e_perm);

		switch (acl->a_entries[i].e_tag) {
		case ACL_USER:
			entry->e_id = cpu_to_le32(
					from_kuid(&init_user_ns,
						acl->a_entries[i].e_uid));
			entry = (struct f2fs_acl_entry *)((char *)entry +
					sizeof(struct f2fs_acl_entry));
			break;
		case ACL_GROUP:
			entry->e_id = cpu_to_le32(
					from_kgid(&init_user_ns,
						acl->a_entries[i].e_gid));
			entry = (struct f2fs_acl_entry *)((char *)entry +
					sizeof(struct f2fs_acl_entry));
			break;
		case ACL_USER_OBJ:
		case ACL_GROUP_OBJ:
		case ACL_MASK:
		case ACL_OTHER:
			entry = (struct f2fs_acl_entry *)((char *)entry +
					sizeof(struct f2fs_acl_entry_short));
			break;
		default:
			goto fail;
		}
	}
	*size = f2fs_acl_size(acl->a_count);
	return (void *)f2fs_acl;

fail:
	kfree(f2fs_acl);
	return ERR_PTR(-EINVAL);
}

struct posix_acl *f2fs_get_acl(struct inode *inode, int type)
{
	struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
	int name_index = F2FS_XATTR_INDEX_POSIX_ACL_DEFAULT;
	void *value = NULL;
	struct posix_acl *acl;
	int retval;

	if (!test_opt(sbi, POSIX_ACL))
		return NULL;

	acl = get_cached_acl(inode, type);
	if (acl != ACL_NOT_CACHED)
		return acl;

	if (type == ACL_TYPE_ACCESS)
		name_index = F2FS_XATTR_INDEX_POSIX_ACL_ACCESS;

	retval = f2fs_getxattr(inode, name_index, "", NULL, 0);
	if (retval > 0) {
		value = kmalloc(retval, GFP_KERNEL);
		if (!value)
			return ERR_PTR(-ENOMEM);
		retval = f2fs_getxattr(inode, name_index, "", value, retval);
	}

	if (retval < 0) {
		if (retval == -ENODATA)
			acl = NULL;
		else
			acl = ERR_PTR(retval);
	} else {
		acl = f2fs_acl_from_disk(value, retval);
	}
	kfree(value);
	if (!IS_ERR(acl))
		set_cached_acl(inode, type, acl);

	return acl;
}

static int f2fs_set_acl(struct inode *inode, int type, struct posix_acl *acl)
{
	struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
	struct f2fs_inode_info *fi = F2FS_I(inode);
	int name_index;
	void *value = NULL;
	size_t size = 0;
	int error;

	if (!test_opt(sbi, POSIX_ACL))
		return 0;
	if (S_ISLNK(inode->i_mode))
		return -EOPNOTSUPP;

	switch (type) {
	case ACL_TYPE_ACCESS:
		name_index = F2FS_XATTR_INDEX_POSIX_ACL_ACCESS;
		if (acl) {
			error = posix_acl_equiv_mode(acl, &inode->i_mode);
			if (error < 0)
				return error;
			set_acl_inode(fi, inode->i_mode);
			if (error == 0)
				acl = NULL;
		}
		break;

	case ACL_TYPE_DEFAULT:
		name_index = F2FS_XATTR_INDEX_POSIX_ACL_DEFAULT;
		if (!S_ISDIR(inode->i_mode))
			return acl ? -EACCES : 0;
		break;

	default:
		return -EINVAL;
	}

	if (acl) {
		value = f2fs_acl_to_disk(acl, &size);
		if (IS_ERR(value)) {
			cond_clear_inode_flag(fi, FI_ACL_MODE);
			return (int)PTR_ERR(value);
		}
	}

	error = f2fs_setxattr(inode, name_index, "", value, size);

	kfree(value);
	if (!error)
		set_cached_acl(inode, type, acl);

	cond_clear_inode_flag(fi, FI_ACL_MODE);
	return error;
}

int f2fs_init_acl(struct inode *inode, struct inode *dir)
{
	struct posix_acl *acl = NULL;
	struct f2fs_sb_info *sbi = F2FS_SB(dir->i_sb);
	int error = 0;

	if (!S_ISLNK(inode->i_mode)) {
		if (test_opt(sbi, POSIX_ACL)) {
			acl = f2fs_get_acl(dir, ACL_TYPE_DEFAULT);
			if (IS_ERR(acl))
				return PTR_ERR(acl);
		}
		if (!acl)
			inode->i_mode &= ~current_umask();
	}

	if (test_opt(sbi, POSIX_ACL) && acl) {

		if (S_ISDIR(inode->i_mode)) {
			error = f2fs_set_acl(inode, ACL_TYPE_DEFAULT, acl);
			if (error)
				goto cleanup;
		}
		error = posix_acl_create(&acl, GFP_KERNEL, &inode->i_mode);
		if (error < 0)
			return error;
		if (error > 0)
			error = f2fs_set_acl(inode, ACL_TYPE_ACCESS, acl);
	}
cleanup:
	posix_acl_release(acl);
	return error;
}

int f2fs_acl_chmod(struct inode *inode)
{
	struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
	struct posix_acl *acl;
	int error;
	mode_t mode = get_inode_mode(inode);

	if (!test_opt(sbi, POSIX_ACL))
		return 0;
	if (S_ISLNK(mode))
		return -EOPNOTSUPP;

	acl = f2fs_get_acl(inode, ACL_TYPE_ACCESS);
	if (IS_ERR(acl) || !acl)
		return PTR_ERR(acl);

	error = posix_acl_chmod(&acl, GFP_KERNEL, mode);
	if (error)
		return error;
	error = f2fs_set_acl(inode, ACL_TYPE_ACCESS, acl);
	posix_acl_release(acl);
	return error;
}

static size_t f2fs_xattr_list_acl(struct dentry *dentry, char *list,
		size_t list_size, const char *name, size_t name_len, int type)
{
	struct f2fs_sb_info *sbi = F2FS_SB(dentry->d_sb);
	const char *xname = POSIX_ACL_XATTR_DEFAULT;
	size_t size;

	if (!test_opt(sbi, POSIX_ACL))
		return 0;

	if (type == ACL_TYPE_ACCESS)
		xname = POSIX_ACL_XATTR_ACCESS;

	size = strlen(xname) + 1;
	if (list && size <= list_size)
		memcpy(list, xname, size);
	return size;
}

static int f2fs_xattr_get_acl(struct dentry *dentry, const char *name,
		void *buffer, size_t size, int type)
{
	struct f2fs_sb_info *sbi = F2FS_SB(dentry->d_sb);
	struct posix_acl *acl;
	int error;

	if (strcmp(name, "") != 0)
		return -EINVAL;
	if (!test_opt(sbi, POSIX_ACL))
		return -EOPNOTSUPP;

	acl = f2fs_get_acl(dentry->d_inode, type);
	if (IS_ERR(acl))
		return PTR_ERR(acl);
	if (!acl)
		return -ENODATA;
	error = posix_acl_to_xattr(&init_user_ns, acl, buffer, size);
	posix_acl_release(acl);

	return error;
}

static int f2fs_xattr_set_acl(struct dentry *dentry, const char *name,
		const void *value, size_t size, int flags, int type)
{
	struct f2fs_sb_info *sbi = F2FS_SB(dentry->d_sb);
	struct inode *inode = dentry->d_inode;
	struct posix_acl *acl = NULL;
	int error;

	if (strcmp(name, "") != 0)
		return -EINVAL;
	if (!test_opt(sbi, POSIX_ACL))
		return -EOPNOTSUPP;
	if (!inode_owner_or_capable(inode))
		return -EPERM;

	if (value) {
		acl = posix_acl_from_xattr(&init_user_ns, value, size);
		if (IS_ERR(acl))
			return PTR_ERR(acl);
		if (acl) {
			error = posix_acl_valid(acl);
			if (error)
				goto release_and_out;
		}
	} else {
		acl = NULL;
	}

	error = f2fs_set_acl(inode, type, acl);

release_and_out:
	posix_acl_release(acl);
	return error;
}

const struct xattr_handler f2fs_xattr_acl_default_handler = {
	.prefix = POSIX_ACL_XATTR_DEFAULT,
	.flags = ACL_TYPE_DEFAULT,
	.list = f2fs_xattr_list_acl,
	.get = f2fs_xattr_get_acl,
	.set = f2fs_xattr_set_acl,
};

const struct xattr_handler f2fs_xattr_acl_access_handler = {
	.prefix = POSIX_ACL_XATTR_ACCESS,
	.flags = ACL_TYPE_ACCESS,
	.list = f2fs_xattr_list_acl,
	.get = f2fs_xattr_get_acl,
	.set = f2fs_xattr_set_acl,
};
/*
* fs/f2fs/acl.h
*
* Copyright (c) 2012 Samsung Electronics Co., Ltd.
* http://www.samsung.com/
*
* Portions of this code from linux/fs/ext2/acl.h
*
* Copyright (C) 2001-2003 Andreas Gruenbacher, <agruen@suse.de>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#ifndef __F2FS_ACL_H__
#define __F2FS_ACL_H__

#include <linux/posix_acl_xattr.h>

#define F2FS_ACL_VERSION	0x0001

struct f2fs_acl_entry {
	__le16 e_tag;
	__le16 e_perm;
	__le32 e_id;
};

struct f2fs_acl_entry_short {
	__le16 e_tag;
	__le16 e_perm;
};

struct f2fs_acl_header {
	__le32 a_version;
};

#ifdef CONFIG_F2FS_FS_POSIX_ACL

extern struct posix_acl *f2fs_get_acl(struct inode *inode, int type);
extern int f2fs_acl_chmod(struct inode *inode);
extern int f2fs_init_acl(struct inode *inode, struct inode *dir);
#else
#define f2fs_check_acl	NULL
#define f2fs_get_acl	NULL
#define f2fs_set_acl	NULL

static inline int f2fs_acl_chmod(struct inode *inode)
{
	return 0;
}

static inline int f2fs_init_acl(struct inode *inode, struct inode *dir)
{
	return 0;
}
#endif
#endif /* __F2FS_ACL_H__ */
/*
* fs/f2fs/data.c
*
* Copyright (c) 2012 Samsung Electronics Co., Ltd.
* http://www.samsung.com/
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/fs.h>
#include <linux/f2fs_fs.h>
#include <linux/buffer_head.h>
#include <linux/mpage.h>
#include <linux/writeback.h>
#include <linux/backing-dev.h>
#include <linux/blkdev.h>
#include <linux/bio.h>
#include "f2fs.h"
#include "node.h"
#include "segment.h"
/*
* Lock ordering for the change of data block address:
* ->data_page
* ->node_page
* update block addresses in the node page
*/
static void __set_data_blkaddr(struct dnode_of_data *dn, block_t new_addr)
{
struct f2fs_node *rn;
__le32 *addr_array;
struct page *node_page = dn->node_page;
unsigned int ofs_in_node = dn->ofs_in_node;
wait_on_page_writeback(node_page);
rn = (struct f2fs_node *)page_address(node_page);
/* Get physical address of data block */
addr_array = blkaddr_in_node(rn);
addr_array[ofs_in_node] = cpu_to_le32(new_addr);
set_page_dirty(node_page);
}
int reserve_new_block(struct dnode_of_data *dn)
{
struct f2fs_sb_info *sbi = F2FS_SB(dn->inode->i_sb);
if (is_inode_flag_set(F2FS_I(dn->inode), FI_NO_ALLOC))
return -EPERM;
if (!inc_valid_block_count(sbi, dn->inode, 1))
return -ENOSPC;
__set_data_blkaddr(dn, NEW_ADDR);
dn->data_blkaddr = NEW_ADDR;
sync_inode_page(dn);
return 0;
}
static int check_extent_cache(struct inode *inode, pgoff_t pgofs,
struct buffer_head *bh_result)
{
struct f2fs_inode_info *fi = F2FS_I(inode);
struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
pgoff_t start_fofs, end_fofs;
block_t start_blkaddr;
read_lock(&fi->ext.ext_lock);
if (fi->ext.len == 0) {
read_unlock(&fi->ext.ext_lock);
return 0;
}
sbi->total_hit_ext++;
start_fofs = fi->ext.fofs;
end_fofs = fi->ext.fofs + fi->ext.len - 1;
start_blkaddr = fi->ext.blk_addr;
if (pgofs >= start_fofs && pgofs <= end_fofs) {
unsigned int blkbits = inode->i_sb->s_blocksize_bits;
size_t count;
clear_buffer_new(bh_result);
map_bh(bh_result, inode->i_sb,
start_blkaddr + pgofs - start_fofs);
count = end_fofs - pgofs + 1;
if (count < (UINT_MAX >> blkbits))
bh_result->b_size = (count << blkbits);
else
bh_result->b_size = UINT_MAX;
sbi->read_hit_ext++;
read_unlock(&fi->ext.ext_lock);
return 1;
}
read_unlock(&fi->ext.ext_lock);
return 0;
}
void update_extent_cache(block_t blk_addr, struct dnode_of_data *dn)
{
struct f2fs_inode_info *fi = F2FS_I(dn->inode);
pgoff_t fofs, start_fofs, end_fofs;
block_t start_blkaddr, end_blkaddr;
BUG_ON(blk_addr == NEW_ADDR);
fofs = start_bidx_of_node(ofs_of_node(dn->node_page)) + dn->ofs_in_node;
/* Update the page address in the parent node */
__set_data_blkaddr(dn, blk_addr);
write_lock(&fi->ext.ext_lock);
start_fofs = fi->ext.fofs;
end_fofs = fi->ext.fofs + fi->ext.len - 1;
start_blkaddr = fi->ext.blk_addr;
end_blkaddr = fi->ext.blk_addr + fi->ext.len - 1;
/* Drop and initialize the matched extent */
if (fi->ext.len == 1 && fofs == start_fofs)
fi->ext.len = 0;
/* Initial extent */
if (fi->ext.len == 0) {
if (blk_addr != NULL_ADDR) {
fi->ext.fofs = fofs;
fi->ext.blk_addr = blk_addr;
fi->ext.len = 1;
}
goto end_update;
}
/* Front merge */
if (fofs == start_fofs - 1 && blk_addr == start_blkaddr - 1) {
fi->ext.fofs--;
fi->ext.blk_addr--;
fi->ext.len++;
goto end_update;
}
/* Back merge */
if (fofs == end_fofs + 1 && blk_addr == end_blkaddr + 1) {
fi->ext.len++;
goto end_update;
}
/* Split the existing extent */
if (fi->ext.len > 1 &&
fofs >= start_fofs && fofs <= end_fofs) {
if ((end_fofs - fofs) < (fi->ext.len >> 1)) {
fi->ext.len = fofs - start_fofs;
} else {
fi->ext.fofs = fofs + 1;
fi->ext.blk_addr = start_blkaddr +
fofs - start_fofs + 1;
fi->ext.len -= fofs - start_fofs + 1;
}
goto end_update;
}
write_unlock(&fi->ext.ext_lock);
return;
end_update:
write_unlock(&fi->ext.ext_lock);
sync_inode_page(dn);
return;
}
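The front-merge, back-merge, and split rules in update_extent_cache() can be tried outside the kernel. The following is a minimal sketch, not the f2fs implementation: `struct ext_sketch` and `ext_update()` are hypothetical names, locking and the node-page update are left out, and the `NULL_ADDR` and `NEW_ADDR` values are assumptions because their definitions are not part of this excerpt.

```c
#include <assert.h>
#include <stdint.h>

#define NULL_ADDR 0u          /* assumed sentinel: no block mapped */
#define NEW_ADDR  UINT32_MAX  /* assumed sentinel: reserved, not yet written */

/* Hypothetical stand-in for the per-inode extent: one contiguous run. */
struct ext_sketch {
	uint64_t fofs;     /* first file offset (in blocks) covered */
	uint32_t blk_addr; /* block address backing fofs */
	uint32_t len;      /* number of blocks; 0 means no extent */
};

/* Returns 1 where the kernel code takes its end_update path, else 0. */
static int ext_update(struct ext_sketch *e, uint64_t fofs, uint32_t blk)
{
	uint64_t start_fofs = e->fofs;
	uint64_t end_fofs = e->fofs + e->len - 1;
	uint32_t start_blk = e->blk_addr;
	uint32_t end_blk = e->blk_addr + e->len - 1;

	assert(blk != NEW_ADDR); /* mirrors the BUG_ON() in the original */

	/* A one-block extent whose only block is rewritten is dropped. */
	if (e->len == 1 && fofs == start_fofs)
		e->len = 0;

	/* No extent: start a new one unless the block is being unmapped. */
	if (e->len == 0) {
		if (blk != NULL_ADDR) {
			e->fofs = fofs;
			e->blk_addr = blk;
			e->len = 1;
		}
		return 1;
	}

	/* Front merge: the new block sits directly before the extent. */
	if (fofs == start_fofs - 1 && blk == start_blk - 1) {
		e->fofs--;
		e->blk_addr--;
		e->len++;
		return 1;
	}

	/* Back merge: the new block sits directly after the extent. */
	if (fofs == end_fofs + 1 && blk == end_blk + 1) {
		e->len++;
		return 1;
	}

	/* Split: the rewritten block is inside; keep the longer side. */
	if (e->len > 1 && fofs >= start_fofs && fofs <= end_fofs) {
		uint32_t head = (uint32_t)(fofs - start_fofs);

		if ((end_fofs - fofs) < (e->len >> 1)) {
			e->len = head;
		} else {
			e->fofs = fofs + 1;
			e->blk_addr = start_blk + head + 1;
			e->len -= head + 1;
		}
		return 1;
	}
	return 0;
}
```

Starting from an empty extent, mapping file offset 10 to block 100 and then offset 11 to block 101 grows the extent to two blocks through the back-merge branch. Rewriting a block in the middle of a longer extent shrinks it to whichever side is longer.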
struct page *find_data_page(struct inode *inode, pgoff_t index)
{
struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
struct address_space *mapping = inode->i_mapping;
struct dnode_of_data dn;
struct page *page;
int err;
page = find_get_page(mapping, index);
if (page && PageUptodate(page))
return page;
f2fs_put_page(page, 0);
set_new_dnode(&dn, inode, NULL, NULL, 0);
err = get_dnode_of_data(&dn, index, RDONLY_NODE);
if (err)
return ERR_PTR(err);
f2fs_put_dnode(&dn);
if (dn.data_blkaddr == NULL_ADDR)
return ERR_PTR(-ENOENT);
/* After fallocate(), a block can hold NEW_ADDR while no page is cached */
if (dn.data_blkaddr == NEW_ADDR)
return ERR_PTR(-EINVAL);
page = grab_cache_page(mapping, index);
if (!page)
return ERR_PTR(-ENOMEM);
err = f2fs_readpage(sbi, page, dn.data_blkaddr, READ_SYNC);
if (err) {
f2fs_put_page(page, 1);
return ERR_PTR(err);
}
unlock_page(page);
return page;
}
/*
* Return an error when the requested index falls in a hole, because the
* callers (functions in dir.c and the GC) need to know whether the page
* exists.
*/
struct page *get_lock_data_page(struct inode *inode, pgoff_t index)
{
struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
struct address_space *mapping = inode->i_mapping;
struct dnode_of_data dn;
struct page *page;
int err;
set_new_dnode(&dn, inode, NULL, NULL, 0);
err = get_dnode_of_data(&dn, index, RDONLY_NODE);
if (err)
return ERR_PTR(err);
f2fs_put_dnode(&dn);
if (dn.data_blkaddr == NULL_ADDR)
return ERR_PTR(-ENOENT);
page = grab_cache_page(mapping, index);
if (!page)
return ERR_PTR(-ENOMEM);
if (PageUptodate(page))
return page;
BUG_ON(dn.data_blkaddr == NEW_ADDR);
BUG_ON(dn.data_blkaddr == NULL_ADDR);
err = f2fs_readpage(sbi, page, dn.data_blkaddr, READ_SYNC);
if (err) {
f2fs_put_page(page, 1);
return ERR_PTR(err);
}
return page;
}
/*
* The caller ensures that this data page has never been allocated.
* A new zero-filled data page is allocated in the page cache.
*/
struct page *get_new_data_page(struct inode *inode, pgoff_t index,
bool new_i_size)
{
struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
struct address_space *mapping = inode->i_mapping;
struct page *page;
struct dnode_of_data dn;
int err;
set_new_dnode(&dn, inode, NULL, NULL, 0);
err = get_dnode_of_data(&dn, index, 0);
if (err)
return ERR_PTR(err);
if (dn.data_blkaddr == NULL_ADDR) {
if (reserve_new_block(&dn)) {
f2fs_put_dnode(&dn);
return ERR_PTR(-ENOSPC);
}
}
f2fs_put_dnode(&dn);
page = grab_cache_page(mapping, index);
if (!page)
return ERR_PTR(-ENOMEM);
if (PageUptodate(page))
return page;
if (dn.data_blkaddr == NEW_ADDR) {
zero_user_segment(page, 0, PAGE_CACHE_SIZE);
} else {
err = f2fs_readpage(sbi, page, dn.data_blkaddr, READ_SYNC);
if (err) {
f2fs_put_page(page, 1);
return ERR_PTR(err);
}
}
SetPageUptodate(page);
if (new_i_size &&
i_size_read(inode) < ((index + 1) << PAGE_CACHE_SHIFT)) {
i_size_write(inode, ((index + 1) << PAGE_CACHE_SHIFT));
mark_inode_dirty_sync(inode);
}
return page;
}
static void read_end_io(struct bio *bio, int err)
{
const int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags);
struct bio_vec *bvec = bio->bi_io_vec + bio->bi_vcnt - 1;
do {
struct page *page = bvec->bv_page;
if (--bvec >= bio->bi_io_vec)
prefetchw(&bvec->bv_page->flags);
if (uptodate) {
SetPageUptodate(page);
} else {
ClearPageUptodate(page);
SetPageError(page);
}
unlock_page(page);
} while (bvec >= bio->bi_io_vec);
kfree(bio->bi_private);
bio_put(bio);
}
/*
* Fill the locked page with data located in the block address.
* Read operation is synchronous, and caller must unlock the page.
*/
int f2fs_readpage(struct f2fs_sb_info *sbi, struct page *page,
block_t blk_addr, int type)
{
struct block_device *bdev = sbi->sb->s_bdev;
bool sync = (type == READ_SYNC);
struct bio *bio;
/* This page may already have been read by another thread */
if (PageUptodate(page)) {
if (!sync)
unlock_page(page);
return 0;
}
down_read(&sbi->bio_sem);
/* Allocate a new bio */
bio = f2fs_bio_alloc(bdev, 1);
/* Initialize the bio */
bio->bi_sector = SECTOR_FROM_BLOCK(sbi, blk_addr);
bio->bi_end_io = read_end_io;
if (bio_add_page(bio, page, PAGE_CACHE_SIZE, 0) < PAGE_CACHE_SIZE) {
kfree(bio->bi_private);
bio_put(bio);
up_read(&sbi->bio_sem);
return -EFAULT;
}
submit_bio(type, bio);
up_read(&sbi->bio_sem);
/* wait for read completion if sync */
if (sync) {
lock_page(page);
if (PageError(page))
return -EIO;
}
return 0;
}
/*
* This function is meant for the data read path only: it never allocates
* blocks, so the "create" flag must not be set. Keeping it read-only lets
* f2fs reuse the generic VFS readahead mechanism.
*/
static int get_data_block_ro(struct inode *inode, sector_t iblock,
struct buffer_head *bh_result, int create)
{
unsigned int blkbits = inode->i_sb->s_blocksize_bits;
unsigned maxblocks = bh_result->b_size >> blkbits;
struct dnode_of_data dn;
pgoff_t pgofs;
int err;
/* Get the page offset from the block offset(iblock) */
pgofs = (pgoff_t)(iblock >> (PAGE_CACHE_SHIFT - blkbits));
if (check_extent_cache(inode, pgofs, bh_result))
return 0;
/* When reading holes, we need its node page */
set_new_dnode(&dn, inode, NULL, NULL, 0);
err = get_dnode_of_data(&dn, pgofs, RDONLY_NODE);
if (err)
return (err == -ENOENT) ? 0 : err;
/* It does not support data allocation */
BUG_ON(create);
if (dn.data_blkaddr != NEW_ADDR && dn.data_blkaddr != NULL_ADDR) {
int i;
unsigned int end_offset;
end_offset = IS_INODE(dn.node_page) ?
ADDRS_PER_INODE :
ADDRS_PER_BLOCK;
clear_buffer_new(bh_result);
/* Give more consecutive addresses for the read ahead */
for (i = 0; i < end_offset - dn.ofs_in_node; i++)
if (((datablock_addr(dn.node_page,
dn.ofs_in_node + i))
!= (dn.data_blkaddr + i)) || maxblocks == i)
break;
map_bh(bh_result, inode->i_sb, dn.data_blkaddr);
bh_result->b_size = (i << blkbits);
}
f2fs_put_dnode(&dn);
return 0;
}
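The readahead loop in get_data_block_ro() reports how many block addresses after the requested one are physically consecutive, so the mpage code can read them in one request. The following is a minimal user-space sketch of that counting step. `contiguous_run()` is a hypothetical helper, and a plain array stands in for the address table of the node page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count how many entries of addrs[], starting at index ofs, hold
 * consecutive block addresses. The count is capped at max_blocks and
 * at the end of the array, as in the loop of get_data_block_ro().
 */
static unsigned int contiguous_run(const uint32_t *addrs, size_t n,
				   size_t ofs, unsigned int max_blocks)
{
	unsigned int i;

	assert(ofs < n);
	for (i = 0; i < n - ofs; i++)
		if (addrs[ofs + i] != addrs[ofs] + i || i == max_blocks)
			break;
	return i;
}
```

In the original function the resulting count, shifted left by the block-size bits, becomes `bh_result->b_size`.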
static int f2fs_read_data_page(struct file *file, struct page *page)
{
return mpage_readpage(page, get_data_block_ro);
}
static int f2fs_read_data_pages(struct file *file,
struct address_space *mapping,
struct list_head *pages, unsigned nr_pages)
{
return mpage_readpages(mapping, pages, nr_pages, get_data_block_ro);
}
int do_write_data_page(struct page *page)
{
struct inode *inode = page->mapping->host;
struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
block_t old_blk_addr, new_blk_addr;
struct dnode_of_data dn;
int err = 0;
set_new_dnode(&dn, inode, NULL, NULL, 0);
err = get_dnode_of_data(&dn, page->index, RDONLY_NODE);
if (err)
return err;
old_blk_addr = dn.data_blkaddr;
/* This page is already truncated */
if (old_blk_addr == NULL_ADDR)
goto out_writepage;
set_page_writeback(page);
/*
* If the current allocation needs SSR,
* it is better to write updated data in place.
*/
if (old_blk_addr != NEW_ADDR && !is_cold_data(page) &&
need_inplace_update(inode)) {
rewrite_data_page(F2FS_SB(inode->i_sb), page,
old_blk_addr);
} else {
write_data_page(inode, page, &dn,
old_blk_addr, &new_blk_addr);
update_extent_cache(new_blk_addr, &dn);
F2FS_I(inode)->data_version =
le64_to_cpu(F2FS_CKPT(sbi)->checkpoint_ver);
}
out_writepage:
f2fs_put_dnode(&dn);
return err;
}
static int f2fs_write_data_page(struct page *page,
struct writeback_control *wbc)
{
struct inode *inode = page->mapping->host;
struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
loff_t i_size = i_size_read(inode);
const pgoff_t end_index = ((unsigned long long) i_size)
>> PAGE_CACHE_SHIFT;
unsigned offset;
int err = 0;
if (page->index < end_index)
goto out;
/*
* If the offset is out-of-range of file size,
* this page does not have to be written to disk.
*/
offset = i_size & (PAGE_CACHE_SIZE - 1);
if ((page->index >= end_index + 1) || !offset) {
if (S_ISDIR(inode->i_mode)) {
dec_page_count(sbi, F2FS_DIRTY_DENTS);
inode_dec_dirty_dents(inode);
}
goto unlock_out;
}
zero_user_segment(page, offset, PAGE_CACHE_SIZE);
out:
if (sbi->por_doing)
goto redirty_out;
if (wbc->for_reclaim && !S_ISDIR(inode->i_mode) && !is_cold_data(page))
goto redirty_out;
mutex_lock_op(sbi, DATA_WRITE);
if (S_ISDIR(inode->i_mode)) {
dec_page_count(sbi, F2FS_DIRTY_DENTS);
inode_dec_dirty_dents(inode);
}
err = do_write_data_page(page);
if (err && err != -ENOENT) {
wbc->pages_skipped++;
set_page_dirty(page);
}
mutex_unlock_op(sbi, DATA_WRITE);
if (wbc->for_reclaim)
f2fs_submit_bio(sbi, DATA, true);
if (err == -ENOENT)
goto unlock_out;
clear_cold_data(page);
unlock_page(page);
if (!wbc->for_reclaim && !S_ISDIR(inode->i_mode))
f2fs_balance_fs(sbi);
return 0;
unlock_out:
unlock_page(page);
return (err == -ENOENT) ? 0 : err;
redirty_out:
wbc->pages_skipped++;
set_page_dirty(page);
return AOP_WRITEPAGE_ACTIVATE;
}
#define MAX_DESIRED_PAGES_WP 4096
static int f2fs_write_data_pages(struct address_space *mapping,
struct writeback_control *wbc)
{
struct inode *inode = mapping->host;
struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
int ret;
long excess_nrtw = 0, desired_nrtw;
if (wbc->nr_to_write < MAX_DESIRED_PAGES_WP) {
desired_nrtw = MAX_DESIRED_PAGES_WP;
excess_nrtw = desired_nrtw - wbc->nr_to_write;
wbc->nr_to_write = desired_nrtw;
}
if (!S_ISDIR(inode->i_mode))
mutex_lock(&sbi->writepages);
ret = generic_writepages(mapping, wbc);
if (!S_ISDIR(inode->i_mode))
mutex_unlock(&sbi->writepages);
f2fs_submit_bio(sbi, DATA, (wbc->sync_mode == WB_SYNC_ALL));
remove_dirty_dir_inode(inode);
wbc->nr_to_write -= excess_nrtw;
return ret;
}
static int f2fs_write_begin(struct file *file, struct address_space *mapping,
loff_t pos, unsigned len, unsigned flags,
struct page **pagep, void **fsdata)
{
struct inode *inode = mapping->host;
struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
struct page *page;
pgoff_t index = ((unsigned long long) pos) >> PAGE_CACHE_SHIFT;
struct dnode_of_data dn;
int err = 0;
/* for nobh_write_end */
*fsdata = NULL;
f2fs_balance_fs(sbi);
page = grab_cache_page_write_begin(mapping, index, flags);
if (!page)
return -ENOMEM;
*pagep = page;
mutex_lock_op(sbi, DATA_NEW);
set_new_dnode(&dn, inode, NULL, NULL, 0);
err = get_dnode_of_data(&dn, index, 0);
if (err) {
mutex_unlock_op(sbi, DATA_NEW);
f2fs_put_page(page, 1);
return err;
}
if (dn.data_blkaddr == NULL_ADDR) {
err = reserve_new_block(&dn);
if (err) {
f2fs_put_dnode(&dn);
mutex_unlock_op(sbi, DATA_NEW);
f2fs_put_page(page, 1);
return err;
}
}
f2fs_put_dnode(&dn);
mutex_unlock_op(sbi, DATA_NEW);
if ((len == PAGE_CACHE_SIZE) || PageUptodate(page))
return 0;
if ((pos & PAGE_CACHE_MASK) >= i_size_read(inode)) {
unsigned start = pos & (PAGE_CACHE_SIZE - 1);
unsigned end = start + len;
/* Reading beyond i_size is simple: memset to zero */
zero_user_segments(page, 0, start, end, PAGE_CACHE_SIZE);
return 0;
}
if (dn.data_blkaddr == NEW_ADDR) {
zero_user_segment(page, 0, PAGE_CACHE_SIZE);
} else {
err = f2fs_readpage(sbi, page, dn.data_blkaddr, READ_SYNC);
if (err) {
f2fs_put_page(page, 1);
return err;
}
}
SetPageUptodate(page);
clear_cold_data(page);
return 0;
}
static ssize_t f2fs_direct_IO(int rw, struct kiocb *iocb,
const struct iovec *iov, loff_t offset, unsigned long nr_segs)
{
struct file *file = iocb->ki_filp;
struct inode *inode = file->f_mapping->host;
if (rw == WRITE)
return 0;
/* Needs synchronization with the cleaner */
return blockdev_direct_IO(rw, iocb, inode, iov, offset, nr_segs,
get_data_block_ro);
}
static void f2fs_invalidate_data_page(struct page *page, unsigned long offset)
{
struct inode *inode = page->mapping->host;
struct f2fs_sb_info *sbi = F2FS_SB(inode->i_sb);
if (S_ISDIR(inode->i_mode) && PageDirty(page)) {
dec_page_count(sbi, F2FS_DIRTY_DENTS);
inode_dec_dirty_dents(inode);
}
ClearPagePrivate(page);
}
static int f2fs_release_data_page(struct page *page, gfp_t wait)
{
ClearPagePrivate(page);
return 0;
}
static int f2fs_set_data_page_dirty(struct page *page)
{
struct address_space *mapping = page->mapping;
struct inode *inode = mapping->host;
SetPageUptodate(page);
if (!PageDirty(page)) {
__set_page_dirty_nobuffers(page);
set_dirty_dir_page(inode, page);
return 1;
}
return 0;
}
const struct address_space_operations f2fs_dblock_aops = {
.readpage = f2fs_read_data_page,
.readpages = f2fs_read_data_pages,
.writepage = f2fs_write_data_page,
.writepages = f2fs_write_data_pages,
.write_begin = f2fs_write_begin,
.write_end = nobh_write_end,
.set_page_dirty = f2fs_set_data_page_dirty,
.invalidatepage = f2fs_invalidate_data_page,
.releasepage = f2fs_release_data_page,
.direct_IO = f2fs_direct_IO,
};
/*
* f2fs debugging statistics
*
* Copyright (c) 2012 Samsung Electronics Co., Ltd.
* http://www.samsung.com/
* Copyright (c) 2012 Linux Foundation
* Copyright (c) 2012 Greg Kroah-Hartman <gregkh@linuxfoundation.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/fs.h>
#include <linux/backing-dev.h>
#include <linux/proc_fs.h>
#include <linux/f2fs_fs.h>
#include <linux/blkdev.h>
#include <linux/debugfs.h>
#include <linux/seq_file.h>
#include "f2fs.h"
#include "node.h"
#include "segment.h"
#include "gc.h"
static LIST_HEAD(f2fs_stat_list);
static struct dentry *debugfs_root;
static void update_general_status(struct f2fs_sb_info *sbi)
{
struct f2fs_stat_info *si = sbi->stat_info;
int i;
/* valid check of the segment numbers */
si->hit_ext = sbi->read_hit_ext;
si->total_ext = sbi->total_hit_ext;
si->ndirty_node = get_pages(sbi, F2FS_DIRTY_NODES);
si->ndirty_dent = get_pages(sbi, F2FS_DIRTY_DENTS);
si->ndirty_dirs = sbi->n_dirty_dirs;
si->ndirty_meta = get_pages(sbi, F2FS_DIRTY_META);
si->total_count = (int)sbi->user_block_count / sbi->blocks_per_seg;
si->rsvd_segs = reserved_segments(sbi);
si->overp_segs = overprovision_segments(sbi);
si->valid_count = valid_user_blocks(sbi);
si->valid_node_count = valid_node_count(sbi);
si->valid_inode_count = valid_inode_count(sbi);
si->utilization = utilization(sbi);
si->free_segs = free_segments(sbi);
si->free_secs = free_sections(sbi);
si->prefree_count = prefree_segments(sbi);
si->dirty_count = dirty_segments(sbi);
si->node_pages = sbi->node_inode->i_mapping->nrpages;
si->meta_pages = sbi->meta_inode->i_mapping->nrpages;
si->nats = NM_I(sbi)->nat_cnt;
si->sits = SIT_I(sbi)->dirty_sentries;
si->fnids = NM_I(sbi)->fcnt;
si->bg_gc = sbi->bg_gc;
si->util_free = (int)(free_user_blocks(sbi) >> sbi->log_blocks_per_seg)
* 100 / (int)(sbi->user_block_count >> sbi->log_blocks_per_seg)
/ 2;
si->util_valid = (int)(written_block_count(sbi) >>
sbi->log_blocks_per_seg)
* 100 / (int)(sbi->user_block_count >> sbi->log_blocks_per_seg)
/ 2;
si->util_invalid = 50 - si->util_free - si->util_valid;
for (i = CURSEG_HOT_DATA; i <= CURSEG_COLD_NODE; i++) {
struct curseg_info *curseg = CURSEG_I(sbi, i);
si->curseg[i] = curseg->segno;
si->cursec[i] = curseg->segno / sbi->segs_per_sec;
si->curzone[i] = si->cursec[i] / sbi->secs_per_zone;
}
for (i = 0; i < 2; i++) {
si->segment_count[i] = sbi->segment_count[i];
si->block_count[i] = sbi->block_count[i];
}
}
/*
* This function calculates the BDF over all segments
*/
static void update_sit_info(struct f2fs_sb_info *sbi)
{
struct f2fs_stat_info *si = sbi->stat_info;
unsigned int blks_per_sec, hblks_per_sec, total_vblocks, bimodal, dist;
struct sit_info *sit_i = SIT_I(sbi);
unsigned int segno, vblocks;
int ndirty = 0;
bimodal = 0;
total_vblocks = 0;
blks_per_sec = sbi->segs_per_sec * (1 << sbi->log_blocks_per_seg);
hblks_per_sec = blks_per_sec / 2;
mutex_lock(&sit_i->sentry_lock);
for (segno = 0; segno < TOTAL_SEGS(sbi); segno += sbi->segs_per_sec) {
vblocks = get_valid_blocks(sbi, segno, sbi->segs_per_sec);
dist = abs(vblocks - hblks_per_sec);
bimodal += dist * dist;
if (vblocks > 0 && vblocks < blks_per_sec) {
total_vblocks += vblocks;
ndirty++;
}
}
mutex_unlock(&sit_i->sentry_lock);
dist = sbi->total_sections * hblks_per_sec * hblks_per_sec / 100;
si->bimodal = bimodal / dist;
if (ndirty)
si->avg_vblocks = total_vblocks / ndirty;
else
si->avg_vblocks = 0;
}
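The BDF value printed in the status file is a normalized sum of squared distances from the half-full point, one term per section. The following is a minimal sketch of the same arithmetic on a plain array. `bdf_score()` is a hypothetical name, and the sketch uses 64-bit accumulators, which the kernel code above does not.

```c
#include <assert.h>
#include <stddef.h>

/*
 * vblocks[i] is the number of valid blocks in section i. The score is
 * sum((vblocks[i] - half)^2) divided by (nsecs * half^2 / 100), so it
 * approaches 100 when every section is either empty or full and drops
 * to 0 when every section is exactly half full.
 */
static unsigned int bdf_score(const unsigned int *vblocks, size_t nsecs,
			      unsigned int blks_per_sec)
{
	unsigned long long sum = 0, scale;
	unsigned int half = blks_per_sec / 2;
	size_t i;

	assert(nsecs > 0 && half > 0);
	for (i = 0; i < nsecs; i++) {
		unsigned int dist = vblocks[i] > half ?
				vblocks[i] - half : half - vblocks[i];

		sum += (unsigned long long)dist * dist;
	}
	scale = (unsigned long long)nsecs * half * half / 100;
	return scale ? (unsigned int)(sum / scale) : 0;
}
```

A high score means the valid blocks are concentrated in a few sections, which is the situation in which cleaning is cheap: victim sections hold few valid blocks to migrate.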
/*
* This function calculates memory footprint.
*/
static void update_mem_info(struct f2fs_sb_info *sbi)
{
struct f2fs_stat_info *si = sbi->stat_info;
unsigned npages;
if (si->base_mem)
goto get_cache;
si->base_mem = sizeof(struct f2fs_sb_info) + sbi->sb->s_blocksize;
si->base_mem += 2 * sizeof(struct f2fs_inode_info);
si->base_mem += sizeof(*sbi->ckpt);
/* build sm */
si->base_mem += sizeof(struct f2fs_sm_info);
/* build sit */
si->base_mem += sizeof(struct sit_info);
si->base_mem += TOTAL_SEGS(sbi) * sizeof(struct seg_entry);
si->base_mem += f2fs_bitmap_size(TOTAL_SEGS(sbi));
si->base_mem += 2 * SIT_VBLOCK_MAP_SIZE * TOTAL_SEGS(sbi);
if (sbi->segs_per_sec > 1)
si->base_mem += sbi->total_sections *
sizeof(struct sec_entry);
si->base_mem += __bitmap_size(sbi, SIT_BITMAP);
/* build free segmap */
si->base_mem += sizeof(struct free_segmap_info);
si->base_mem += f2fs_bitmap_size(TOTAL_SEGS(sbi));
si->base_mem += f2fs_bitmap_size(sbi->total_sections);
/* build curseg */
si->base_mem += sizeof(struct curseg_info) * NR_CURSEG_TYPE;
si->base_mem += PAGE_CACHE_SIZE * NR_CURSEG_TYPE;
/* build dirty segmap */
si->base_mem += sizeof(struct dirty_seglist_info);
si->base_mem += NR_DIRTY_TYPE * f2fs_bitmap_size(TOTAL_SEGS(sbi));
si->base_mem += 2 * f2fs_bitmap_size(TOTAL_SEGS(sbi));
/* build nm */
si->base_mem += sizeof(struct f2fs_nm_info);
si->base_mem += __bitmap_size(sbi, NAT_BITMAP);
/* build gc */
si->base_mem += sizeof(struct f2fs_gc_kthread);
get_cache:
/* free nids */
si->cache_mem = NM_I(sbi)->fcnt;
si->cache_mem += NM_I(sbi)->nat_cnt;
npages = sbi->node_inode->i_mapping->nrpages;
si->cache_mem += npages << PAGE_CACHE_SHIFT;
npages = sbi->meta_inode->i_mapping->nrpages;
si->cache_mem += npages << PAGE_CACHE_SHIFT;
si->cache_mem += sbi->n_orphans * sizeof(struct orphan_inode_entry);
si->cache_mem += sbi->n_dirty_dirs * sizeof(struct dir_inode_entry);
}
static int stat_show(struct seq_file *s, void *v)
{
struct f2fs_stat_info *si, *next;
int i = 0;
int j;
list_for_each_entry_safe(si, next, &f2fs_stat_list, stat_list) {
mutex_lock(&si->stat_lock);
if (!si->sbi) {
mutex_unlock(&si->stat_lock);
continue;
}
update_general_status(si->sbi);
seq_printf(s, "\n=====[ partition info. #%d ]=====\n", i++);
seq_printf(s, "[SB: 1] [CP: 2] [NAT: %d] [SIT: %d] ",
si->nat_area_segs, si->sit_area_segs);
seq_printf(s, "[SSA: %d] [MAIN: %d",
si->ssa_area_segs, si->main_area_segs);
seq_printf(s, "(OverProv:%d Resv:%d)]\n\n",
si->overp_segs, si->rsvd_segs);
seq_printf(s, "Utilization: %d%% (%d valid blocks)\n",
si->utilization, si->valid_count);
seq_printf(s, " - Node: %u (Inode: %u, ",
si->valid_node_count, si->valid_inode_count);
seq_printf(s, "Other: %u)\n - Data: %u\n",
si->valid_node_count - si->valid_inode_count,
si->valid_count - si->valid_node_count);
seq_printf(s, "\nMain area: %d segs, %d secs %d zones\n",
si->main_area_segs, si->main_area_sections,
si->main_area_zones);
seq_printf(s, " - COLD data: %d, %d, %d\n",
si->curseg[CURSEG_COLD_DATA],
si->cursec[CURSEG_COLD_DATA],
si->curzone[CURSEG_COLD_DATA]);
seq_printf(s, " - WARM data: %d, %d, %d\n",
si->curseg[CURSEG_WARM_DATA],
si->cursec[CURSEG_WARM_DATA],
si->curzone[CURSEG_WARM_DATA]);
seq_printf(s, " - HOT data: %d, %d, %d\n",
si->curseg[CURSEG_HOT_DATA],
si->cursec[CURSEG_HOT_DATA],
si->curzone[CURSEG_HOT_DATA]);
seq_printf(s, " - Dir dnode: %d, %d, %d\n",
si->curseg[CURSEG_HOT_NODE],
si->cursec[CURSEG_HOT_NODE],
si->curzone[CURSEG_HOT_NODE]);
seq_printf(s, " - File dnode: %d, %d, %d\n",
si->curseg[CURSEG_WARM_NODE],
si->cursec[CURSEG_WARM_NODE],
si->curzone[CURSEG_WARM_NODE]);
seq_printf(s, " - Indir nodes: %d, %d, %d\n",
si->curseg[CURSEG_COLD_NODE],
si->cursec[CURSEG_COLD_NODE],
si->curzone[CURSEG_COLD_NODE]);
seq_printf(s, "\n - Valid: %d\n - Dirty: %d\n",
si->main_area_segs - si->dirty_count -
si->prefree_count - si->free_segs,
si->dirty_count);
seq_printf(s, " - Prefree: %d\n - Free: %d (%d)\n\n",
si->prefree_count, si->free_segs, si->free_secs);
seq_printf(s, "GC calls: %d (BG: %d)\n",
si->call_count, si->bg_gc);
seq_printf(s, " - data segments : %d\n", si->data_segs);
seq_printf(s, " - node segments : %d\n", si->node_segs);
seq_printf(s, "Try to move %d blocks\n", si->tot_blks);
seq_printf(s, " - data blocks : %d\n", si->data_blks);
seq_printf(s, " - node blocks : %d\n", si->node_blks);
seq_printf(s, "\nExtent Hit Ratio: %d / %d\n",
si->hit_ext, si->total_ext);
seq_printf(s, "\nBalancing F2FS Async:\n");
seq_printf(s, " - nodes %4d in %4d\n",
si->ndirty_node, si->node_pages);
seq_printf(s, " - dents %4d in dirs:%4d\n",
si->ndirty_dent, si->ndirty_dirs);
seq_printf(s, " - meta %4d in %4d\n",
si->ndirty_meta, si->meta_pages);
seq_printf(s, " - NATs %5d > %lu\n",
si->nats, NM_WOUT_THRESHOLD);
seq_printf(s, " - SITs: %5d\n - free_nids: %5d\n",
si->sits, si->fnids);
seq_printf(s, "\nDistribution of User Blocks:");
seq_printf(s, " [ valid | invalid | free ]\n");
seq_printf(s, " [");
for (j = 0; j < si->util_valid; j++)
seq_printf(s, "-");
seq_printf(s, "|");
for (j = 0; j < si->util_invalid; j++)
seq_printf(s, "-");
seq_printf(s, "|");
for (j = 0; j < si->util_free; j++)
seq_printf(s, "-");
seq_printf(s, "]\n\n");
seq_printf(s, "SSR: %u blocks in %u segments\n",
si->block_count[SSR], si->segment_count[SSR]);
seq_printf(s, "LFS: %u blocks in %u segments\n",
si->block_count[LFS], si->segment_count[LFS]);
/* segment usage info */
update_sit_info(si->sbi);
seq_printf(s, "\nBDF: %u, avg. vblocks: %u\n",
si->bimodal, si->avg_vblocks);
/* memory footprint */
update_mem_info(si->sbi);
seq_printf(s, "\nMemory: %u KB = static: %u + cached: %u\n",
(si->base_mem + si->cache_mem) >> 10,
si->base_mem >> 10, si->cache_mem >> 10);
mutex_unlock(&si->stat_lock);
}
return 0;
}
static int stat_open(struct inode *inode, struct file *file)
{
return single_open(file, stat_show, inode->i_private);
}
static const struct file_operations stat_fops = {
.open = stat_open,
.read = seq_read,
.llseek = seq_lseek,
.release = single_release,
};
static int init_stats(struct f2fs_sb_info *sbi)
{
struct f2fs_super_block *raw_super = F2FS_RAW_SUPER(sbi);
struct f2fs_stat_info *si;
sbi->stat_info = kzalloc(sizeof(struct f2fs_stat_info), GFP_KERNEL);
if (!sbi->stat_info)
return -ENOMEM;
si = sbi->stat_info;
mutex_init(&si->stat_lock);
list_add_tail(&si->stat_list, &f2fs_stat_list);
si->all_area_segs = le32_to_cpu(raw_super->segment_count);
si->sit_area_segs = le32_to_cpu(raw_super->segment_count_sit);
si->nat_area_segs = le32_to_cpu(raw_super->segment_count_nat);
si->ssa_area_segs = le32_to_cpu(raw_super->segment_count_ssa);
si->main_area_segs = le32_to_cpu(raw_super->segment_count_main);
si->main_area_sections = le32_to_cpu(raw_super->section_count);
si->main_area_zones = si->main_area_sections /
le32_to_cpu(raw_super->secs_per_zone);
si->sbi = sbi;
return 0;
}
int f2fs_build_stats(struct f2fs_sb_info *sbi)
{
int retval;
retval = init_stats(sbi);
if (retval)
return retval;
if (!debugfs_root)
debugfs_root = debugfs_create_dir("f2fs", NULL);
debugfs_create_file("status", S_IRUGO, debugfs_root, NULL, &stat_fops);
return 0;
}
void f2fs_destroy_stats(struct f2fs_sb_info *sbi)
{
struct f2fs_stat_info *si = sbi->stat_info;
list_del(&si->stat_list);
mutex_lock(&si->stat_lock);
si->sbi = NULL;
mutex_unlock(&si->stat_lock);
kfree(sbi->stat_info);
}
void destroy_root_stats(void)
{
debugfs_remove_recursive(debugfs_root);
debugfs_root = NULL;
}
/*
* fs/f2fs/dir.c
*
* Copyright (c) 2012 Samsung Electronics Co., Ltd.
* http://www.samsung.com/
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/fs.h>
#include <linux/f2fs_fs.h>
#include "f2fs.h"
#include "acl.h"
static unsigned long dir_blocks(struct inode *inode)
{
return ((unsigned long long) (i_size_read(inode) + PAGE_CACHE_SIZE - 1))
>> PAGE_CACHE_SHIFT;
}
static unsigned int dir_buckets(unsigned int level)
{
if (level < MAX_DIR_HASH_DEPTH / 2)
return 1 << level;
else
return 1 << ((MAX_DIR_HASH_DEPTH / 2) - 1);
}
static unsigned int bucket_blocks(unsigned int level)
{
if (level < MAX_DIR_HASH_DEPTH / 2)
return 2;
else
return 4;
}
static unsigned char f2fs_filetype_table[F2FS_FT_MAX] = {
[F2FS_FT_UNKNOWN] = DT_UNKNOWN,
[F2FS_FT_REG_FILE] = DT_REG,
[F2FS_FT_DIR] = DT_DIR,
[F2FS_FT_CHRDEV] = DT_CHR,
[F2FS_FT_BLKDEV] = DT_BLK,
[F2FS_FT_FIFO] = DT_FIFO,
[F2FS_FT_SOCK] = DT_SOCK,
[F2FS_FT_SYMLINK] = DT_LNK,
};
#define S_SHIFT 12
static unsigned char f2fs_type_by_mode[S_IFMT >> S_SHIFT] = {
[S_IFREG >> S_SHIFT] = F2FS_FT_REG_FILE,
[S_IFDIR >> S_SHIFT] = F2FS_FT_DIR,
[S_IFCHR >> S_SHIFT] = F2FS_FT_CHRDEV,
[S_IFBLK >> S_SHIFT] = F2FS_FT_BLKDEV,
[S_IFIFO >> S_SHIFT] = F2FS_FT_FIFO,
[S_IFSOCK >> S_SHIFT] = F2FS_FT_SOCK,
[S_IFLNK >> S_SHIFT] = F2FS_FT_SYMLINK,
};
static void set_de_type(struct f2fs_dir_entry *de, struct inode *inode)
{
mode_t mode = inode->i_mode;
de->file_type = f2fs_type_by_mode[(mode & S_IFMT) >> S_SHIFT];
}
static unsigned long dir_block_index(unsigned int level, unsigned int idx)
{
unsigned long i;
unsigned long bidx = 0;
for (i = 0; i < level; i++)
bidx += dir_buckets(i) * bucket_blocks(i);
bidx += idx * bucket_blocks(level);
return bidx;
}
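The three helpers above define the multi-level hash layout of a directory: level n has dir_buckets(n) buckets of bucket_blocks(n) blocks each, and the levels are laid out back to back in the directory file. The following is a minimal stand-alone sketch of the same arithmetic. The value 63 for `MAX_DIR_HASH_DEPTH` is an assumption, since the constant is defined outside this excerpt, and the sketch widens the result to `unsigned long long` so deep levels do not overflow.

```c
#include <assert.h>

/* Assumed value; the real constant is defined outside this excerpt. */
#define MAX_DIR_HASH_DEPTH 63

/* Number of hash buckets at a level: doubles per level, then saturates. */
static unsigned int dir_buckets(unsigned int level)
{
	if (level < MAX_DIR_HASH_DEPTH / 2)
		return 1u << level;
	return 1u << ((MAX_DIR_HASH_DEPTH / 2) - 1);
}

/* Blocks per bucket: 2 in the shallow levels, 4 in the deep ones. */
static unsigned int bucket_blocks(unsigned int level)
{
	return level < MAX_DIR_HASH_DEPTH / 2 ? 2 : 4;
}

/* First block of bucket idx at the given level. */
static unsigned long long dir_block_index(unsigned int level, unsigned int idx)
{
	unsigned long long bidx = 0;
	unsigned int i;

	assert(level <= MAX_DIR_HASH_DEPTH);
	/* Skip every block that belongs to the shallower levels. */
	for (i = 0; i < level; i++)
		bidx += (unsigned long long)dir_buckets(i) * bucket_blocks(i);
	return bidx + (unsigned long long)idx * bucket_blocks(level);
}
```

Under these assumptions level 0 occupies blocks 0 to 1, level 1 occupies blocks 2 to 5, and level 2 occupies blocks 6 to 13. A lookup picks the bucket with `hash % dir_buckets(level)` and scans only the `bucket_blocks(level)` blocks that start at the returned index.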
static bool early_match_name(const char *name, int namelen,
f2fs_hash_t namehash, struct f2fs_dir_entry *de)
{
if (le16_to_cpu(de->name_len) != namelen)
return false;
if (de->hash_code != namehash)
return false;
return true;
}
static struct f2fs_dir_entry *find_in_block(struct page *dentry_page,
const char *name, int namelen, int *max_slots,
f2fs_hash_t namehash, struct page **res_page)
{
struct f2fs_dir_entry *de;
unsigned long bit_pos, end_pos, next_pos;
struct f2fs_dentry_block *dentry_blk = kmap(dentry_page);
int slots;
bit_pos = find_next_bit_le(&dentry_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK, 0);
while (bit_pos < NR_DENTRY_IN_BLOCK) {
de = &dentry_blk->dentry[bit_pos];
slots = GET_DENTRY_SLOTS(le16_to_cpu(de->name_len));
if (early_match_name(name, namelen, namehash, de)) {
if (!memcmp(dentry_blk->filename[bit_pos],
name, namelen)) {
*res_page = dentry_page;
goto found;
}
}
next_pos = bit_pos + slots;
bit_pos = find_next_bit_le(&dentry_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK, next_pos);
if (bit_pos >= NR_DENTRY_IN_BLOCK)
end_pos = NR_DENTRY_IN_BLOCK;
else
end_pos = bit_pos;
if (*max_slots < end_pos - next_pos)
*max_slots = end_pos - next_pos;
}
de = NULL;
kunmap(dentry_page);
found:
return de;
}
static struct f2fs_dir_entry *find_in_level(struct inode *dir,
unsigned int level, const char *name, int namelen,
f2fs_hash_t namehash, struct page **res_page)
{
int s = GET_DENTRY_SLOTS(namelen);
unsigned int nbucket, nblock;
unsigned int bidx, end_block;
struct page *dentry_page;
struct f2fs_dir_entry *de = NULL;
bool room = false;
int max_slots = 0;
BUG_ON(level > MAX_DIR_HASH_DEPTH);
nbucket = dir_buckets(level);
nblock = bucket_blocks(level);
bidx = dir_block_index(level, le32_to_cpu(namehash) % nbucket);
end_block = bidx + nblock;
for (; bidx < end_block; bidx++) {
/* no need to allocate new dentry pages to all the indices */
dentry_page = find_data_page(dir, bidx);
if (IS_ERR(dentry_page)) {
room = true;
continue;
}
de = find_in_block(dentry_page, name, namelen,
&max_slots, namehash, res_page);
if (de)
break;
if (max_slots >= s)
room = true;
f2fs_put_page(dentry_page, 0);
}
if (!de && room && F2FS_I(dir)->chash != namehash) {
F2FS_I(dir)->chash = namehash;
F2FS_I(dir)->clevel = level;
}
return de;
}
/*
* Find an entry in the specified directory with the wanted name.
* It returns the page where the entry was found (as a parameter - res_page),
* and the entry itself. Page is returned mapped and unlocked.
* Entry is guaranteed to be valid.
*/
struct f2fs_dir_entry *f2fs_find_entry(struct inode *dir,
struct qstr *child, struct page **res_page)
{
const char *name = child->name;
int namelen = child->len;
unsigned long npages = dir_blocks(dir);
struct f2fs_dir_entry *de = NULL;
f2fs_hash_t name_hash;
unsigned int max_depth;
unsigned int level;
if (npages == 0)
return NULL;
*res_page = NULL;
name_hash = f2fs_dentry_hash(name, namelen);
max_depth = F2FS_I(dir)->i_current_depth;
for (level = 0; level < max_depth; level++) {
de = find_in_level(dir, level, name,
namelen, name_hash, res_page);
if (de)
break;
}
if (!de && F2FS_I(dir)->chash != name_hash) {
F2FS_I(dir)->chash = name_hash;
F2FS_I(dir)->clevel = level - 1;
}
return de;
}
struct f2fs_dir_entry *f2fs_parent_dir(struct inode *dir, struct page **p)
{
struct page *page = NULL;
struct f2fs_dir_entry *de = NULL;
struct f2fs_dentry_block *dentry_blk = NULL;
page = get_lock_data_page(dir, 0);
if (IS_ERR(page))
return NULL;
dentry_blk = kmap(page);
de = &dentry_blk->dentry[1];
*p = page;
unlock_page(page);
return de;
}
ino_t f2fs_inode_by_name(struct inode *dir, struct qstr *qstr)
{
ino_t res = 0;
struct f2fs_dir_entry *de;
struct page *page;
de = f2fs_find_entry(dir, qstr, &page);
if (de) {
res = le32_to_cpu(de->ino);
kunmap(page);
f2fs_put_page(page, 0);
}
return res;
}
void f2fs_set_link(struct inode *dir, struct f2fs_dir_entry *de,
struct page *page, struct inode *inode)
{
struct f2fs_sb_info *sbi = F2FS_SB(dir->i_sb);
mutex_lock_op(sbi, DENTRY_OPS);
lock_page(page);
wait_on_page_writeback(page);
de->ino = cpu_to_le32(inode->i_ino);
set_de_type(de, inode);
kunmap(page);
set_page_dirty(page);
dir->i_mtime = dir->i_ctime = CURRENT_TIME;
mark_inode_dirty(dir);
/* update parent inode number before releasing dentry page */
F2FS_I(inode)->i_pino = dir->i_ino;
f2fs_put_page(page, 1);
mutex_unlock_op(sbi, DENTRY_OPS);
}
void init_dent_inode(struct dentry *dentry, struct page *ipage)
{
struct f2fs_node *rn;
if (IS_ERR(ipage))
return;
wait_on_page_writeback(ipage);
/* copy dentry info. to this inode page */
rn = (struct f2fs_node *)page_address(ipage);
rn->i.i_namelen = cpu_to_le32(dentry->d_name.len);
memcpy(rn->i.i_name, dentry->d_name.name, dentry->d_name.len);
set_page_dirty(ipage);
}
static int init_inode_metadata(struct inode *inode, struct dentry *dentry)
{
struct inode *dir = dentry->d_parent->d_inode;
if (is_inode_flag_set(F2FS_I(inode), FI_NEW_INODE)) {
int err;
err = new_inode_page(inode, dentry);
if (err)
return err;
if (S_ISDIR(inode->i_mode)) {
err = f2fs_make_empty(inode, dir);
if (err) {
remove_inode_page(inode);
return err;
}
}
err = f2fs_init_acl(inode, dir);
if (err) {
remove_inode_page(inode);
return err;
}
} else {
struct page *ipage;
ipage = get_node_page(F2FS_SB(dir->i_sb), inode->i_ino);
if (IS_ERR(ipage))
return PTR_ERR(ipage);
init_dent_inode(dentry, ipage);
f2fs_put_page(ipage, 1);
}
if (is_inode_flag_set(F2FS_I(inode), FI_INC_LINK)) {
inc_nlink(inode);
f2fs_write_inode(inode, NULL);
}
return 0;
}
static void update_parent_metadata(struct inode *dir, struct inode *inode,
unsigned int current_depth)
{
bool need_dir_update = false;
if (is_inode_flag_set(F2FS_I(inode), FI_NEW_INODE)) {
if (S_ISDIR(inode->i_mode)) {
inc_nlink(dir);
need_dir_update = true;
}
clear_inode_flag(F2FS_I(inode), FI_NEW_INODE);
}
dir->i_mtime = dir->i_ctime = CURRENT_TIME;
if (F2FS_I(dir)->i_current_depth != current_depth) {
F2FS_I(dir)->i_current_depth = current_depth;
need_dir_update = true;
}
if (need_dir_update)
f2fs_write_inode(dir, NULL);
else
mark_inode_dirty(dir);
if (is_inode_flag_set(F2FS_I(inode), FI_INC_LINK))
clear_inode_flag(F2FS_I(inode), FI_INC_LINK);
}
static int room_for_filename(struct f2fs_dentry_block *dentry_blk, int slots)
{
	int bit_start = 0;
	int zero_start, zero_end;
next:
	zero_start = find_next_zero_bit_le(&dentry_blk->dentry_bitmap,
						NR_DENTRY_IN_BLOCK,
						bit_start);
	if (zero_start >= NR_DENTRY_IN_BLOCK)
		return NR_DENTRY_IN_BLOCK;

	zero_end = find_next_bit_le(&dentry_blk->dentry_bitmap,
						NR_DENTRY_IN_BLOCK,
						zero_start);
	if (zero_end - zero_start >= slots)
		return zero_start;

	bit_start = zero_end + 1;

	if (zero_end + 1 >= NR_DENTRY_IN_BLOCK)
		return NR_DENTRY_IN_BLOCK;
	goto next;
}
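The first-fit search performed by room_for_filename() can be tried outside the kernel. The sketch below is only an illustration: it replaces the kernel helpers find_next_zero_bit_le() and find_next_bit_le() with plain loops, and NR_SLOTS, test_slot() and first_fit() are hypothetical names, with the block size chosen arbitrarily rather than taken from the f2fs on-disk format.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical block size for illustration; not the on-disk f2fs value. */
#define NR_SLOTS 16

/* Return nonzero if slot `pos` is marked as used in the bitmap. */
static int test_slot(const unsigned char *bitmap, int pos)
{
	return (bitmap[pos / CHAR_BIT] >> (pos % CHAR_BIT)) & 1;
}

/*
 * Return the index of the first run of at least `slots` free (zero) bits,
 * or NR_SLOTS if no such run exists. The "not found" convention mirrors
 * the one used by room_for_filename().
 */
static int first_fit(const unsigned char *bitmap, int slots)
{
	int start = 0;

	assert(slots > 0);
	while (start < NR_SLOTS) {
		int end;

		/* Skip used slots to reach the start of a free run. */
		while (start < NR_SLOTS && test_slot(bitmap, start))
			start++;
		if (start >= NR_SLOTS)
			break;

		/* Measure the length of the free run. */
		end = start;
		while (end < NR_SLOTS && !test_slot(bitmap, end))
			end++;
		if (end - start >= slots)
			return start;

		/* `end` is a used slot, so resume the search just past it. */
		start = end + 1;
	}
	return NR_SLOTS;
}
```

With a bitmap whose first two slots are used (as in a fresh directory block holding "." and ".."), a request for three slots would land at index 2.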
int f2fs_add_link(struct dentry *dentry, struct inode *inode)
{
unsigned int bit_pos;
unsigned int level;
unsigned int current_depth;
unsigned long bidx, block;
f2fs_hash_t dentry_hash;
struct f2fs_dir_entry *de;
unsigned int nbucket, nblock;
struct inode *dir = dentry->d_parent->d_inode;
struct f2fs_sb_info *sbi = F2FS_SB(dir->i_sb);
const char *name = dentry->d_name.name;
int namelen = dentry->d_name.len;
struct page *dentry_page = NULL;
struct f2fs_dentry_block *dentry_blk = NULL;
int slots = GET_DENTRY_SLOTS(namelen);
int err = 0;
int i;
dentry_hash = f2fs_dentry_hash(name, dentry->d_name.len);
level = 0;
current_depth = F2FS_I(dir)->i_current_depth;
if (F2FS_I(dir)->chash == dentry_hash) {
level = F2FS_I(dir)->clevel;
F2FS_I(dir)->chash = 0;
}
start:
if (current_depth == MAX_DIR_HASH_DEPTH)
return -ENOSPC;
/* Increase the depth, if required */
if (level == current_depth)
++current_depth;
nbucket = dir_buckets(level);
nblock = bucket_blocks(level);
bidx = dir_block_index(level, (le32_to_cpu(dentry_hash) % nbucket));
for (block = bidx; block <= (bidx + nblock - 1); block++) {
mutex_lock_op(sbi, DENTRY_OPS);
dentry_page = get_new_data_page(dir, block, true);
if (IS_ERR(dentry_page)) {
mutex_unlock_op(sbi, DENTRY_OPS);
return PTR_ERR(dentry_page);
}
dentry_blk = kmap(dentry_page);
bit_pos = room_for_filename(dentry_blk, slots);
if (bit_pos < NR_DENTRY_IN_BLOCK)
goto add_dentry;
kunmap(dentry_page);
f2fs_put_page(dentry_page, 1);
mutex_unlock_op(sbi, DENTRY_OPS);
}
/* Move to next level to find the empty slot for new dentry */
++level;
goto start;
add_dentry:
err = init_inode_metadata(inode, dentry);
if (err)
goto fail;
wait_on_page_writeback(dentry_page);
de = &dentry_blk->dentry[bit_pos];
de->hash_code = dentry_hash;
de->name_len = cpu_to_le16(namelen);
memcpy(dentry_blk->filename[bit_pos], name, namelen);
de->ino = cpu_to_le32(inode->i_ino);
set_de_type(de, inode);
for (i = 0; i < slots; i++)
test_and_set_bit_le(bit_pos + i, &dentry_blk->dentry_bitmap);
set_page_dirty(dentry_page);
update_parent_metadata(dir, inode, current_depth);
/* update parent inode number before releasing dentry page */
F2FS_I(inode)->i_pino = dir->i_ino;
fail:
kunmap(dentry_page);
f2fs_put_page(dentry_page, 1);
mutex_unlock_op(sbi, DENTRY_OPS);
return err;
}
/*
 * It only removes the dentry from the dentry page; the corresponding name
 * entry in the name page does not need to be touched during deletion.
 */
void f2fs_delete_entry(struct f2fs_dir_entry *dentry, struct page *page,
struct inode *inode)
{
struct f2fs_dentry_block *dentry_blk;
unsigned int bit_pos;
struct address_space *mapping = page->mapping;
struct inode *dir = mapping->host;
struct f2fs_sb_info *sbi = F2FS_SB(dir->i_sb);
int slots = GET_DENTRY_SLOTS(le16_to_cpu(dentry->name_len));
void *kaddr = page_address(page);
int i;
mutex_lock_op(sbi, DENTRY_OPS);
lock_page(page);
wait_on_page_writeback(page);
dentry_blk = (struct f2fs_dentry_block *)kaddr;
bit_pos = dentry - (struct f2fs_dir_entry *)dentry_blk->dentry;
for (i = 0; i < slots; i++)
test_and_clear_bit_le(bit_pos + i, &dentry_blk->dentry_bitmap);
/* Let's check and deallocate this dentry page */
bit_pos = find_next_bit_le(&dentry_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
0);
kunmap(page); /* kunmap - pair of f2fs_find_entry */
set_page_dirty(page);
dir->i_ctime = dir->i_mtime = CURRENT_TIME;
if (inode && S_ISDIR(inode->i_mode)) {
drop_nlink(dir);
f2fs_write_inode(dir, NULL);
} else {
mark_inode_dirty(dir);
}
if (inode) {
inode->i_ctime = dir->i_ctime = dir->i_mtime = CURRENT_TIME;
drop_nlink(inode);
if (S_ISDIR(inode->i_mode)) {
drop_nlink(inode);
i_size_write(inode, 0);
}
f2fs_write_inode(inode, NULL);
if (inode->i_nlink == 0)
add_orphan_inode(sbi, inode->i_ino);
}
if (bit_pos == NR_DENTRY_IN_BLOCK) {
truncate_hole(dir, page->index, page->index + 1);
clear_page_dirty_for_io(page);
ClearPageUptodate(page);
dec_page_count(sbi, F2FS_DIRTY_DENTS);
inode_dec_dirty_dents(dir);
}
f2fs_put_page(page, 1);
mutex_unlock_op(sbi, DENTRY_OPS);
}
int f2fs_make_empty(struct inode *inode, struct inode *parent)
{
struct page *dentry_page;
struct f2fs_dentry_block *dentry_blk;
struct f2fs_dir_entry *de;
void *kaddr;
dentry_page = get_new_data_page(inode, 0, true);
if (IS_ERR(dentry_page))
return PTR_ERR(dentry_page);
kaddr = kmap_atomic(dentry_page);
dentry_blk = (struct f2fs_dentry_block *)kaddr;
de = &dentry_blk->dentry[0];
de->name_len = cpu_to_le16(1);
de->hash_code = 0;
de->ino = cpu_to_le32(inode->i_ino);
memcpy(dentry_blk->filename[0], ".", 1);
set_de_type(de, inode);
de = &dentry_blk->dentry[1];
de->hash_code = 0;
de->name_len = cpu_to_le16(2);
de->ino = cpu_to_le32(parent->i_ino);
memcpy(dentry_blk->filename[1], "..", 2);
set_de_type(de, inode);
test_and_set_bit_le(0, &dentry_blk->dentry_bitmap);
test_and_set_bit_le(1, &dentry_blk->dentry_bitmap);
kunmap_atomic(kaddr);
set_page_dirty(dentry_page);
f2fs_put_page(dentry_page, 1);
return 0;
}
bool f2fs_empty_dir(struct inode *dir)
{
unsigned long bidx;
struct page *dentry_page;
unsigned int bit_pos;
struct f2fs_dentry_block *dentry_blk;
unsigned long nblock = dir_blocks(dir);
for (bidx = 0; bidx < nblock; bidx++) {
void *kaddr;
dentry_page = get_lock_data_page(dir, bidx);
if (IS_ERR(dentry_page)) {
if (PTR_ERR(dentry_page) == -ENOENT)
continue;
else
return false;
}
kaddr = kmap_atomic(dentry_page);
dentry_blk = (struct f2fs_dentry_block *)kaddr;
if (bidx == 0)
bit_pos = 2;
else
bit_pos = 0;
bit_pos = find_next_bit_le(&dentry_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
bit_pos);
kunmap_atomic(kaddr);
f2fs_put_page(dentry_page, 1);
if (bit_pos < NR_DENTRY_IN_BLOCK)
return false;
}
return true;
}
static int f2fs_readdir(struct file *file, void *dirent, filldir_t filldir)
{
unsigned long pos = file->f_pos;
struct inode *inode = file->f_dentry->d_inode;
unsigned long npages = dir_blocks(inode);
unsigned char *types = NULL;
unsigned int bit_pos = 0, start_bit_pos = 0;
int over = 0;
struct f2fs_dentry_block *dentry_blk = NULL;
struct f2fs_dir_entry *de = NULL;
struct page *dentry_page = NULL;
unsigned int n = 0;
unsigned char d_type = DT_UNKNOWN;
int slots;
types = f2fs_filetype_table;
bit_pos = (pos % NR_DENTRY_IN_BLOCK);
n = (pos / NR_DENTRY_IN_BLOCK);
for ( ; n < npages; n++) {
dentry_page = get_lock_data_page(inode, n);
if (IS_ERR(dentry_page))
continue;
start_bit_pos = bit_pos;
dentry_blk = kmap(dentry_page);
while (bit_pos < NR_DENTRY_IN_BLOCK) {
d_type = DT_UNKNOWN;
bit_pos = find_next_bit_le(&dentry_blk->dentry_bitmap,
NR_DENTRY_IN_BLOCK,
bit_pos);
if (bit_pos >= NR_DENTRY_IN_BLOCK)
break;
de = &dentry_blk->dentry[bit_pos];
if (types && de->file_type < F2FS_FT_MAX)
d_type = types[de->file_type];
over = filldir(dirent,
dentry_blk->filename[bit_pos],
le16_to_cpu(de->name_len),
(n * NR_DENTRY_IN_BLOCK) + bit_pos,
le32_to_cpu(de->ino), d_type);
if (over) {
file->f_pos += bit_pos - start_bit_pos;
goto success;
}
slots = GET_DENTRY_SLOTS(le16_to_cpu(de->name_len));
bit_pos += slots;
}
bit_pos = 0;
file->f_pos = (n + 1) * NR_DENTRY_IN_BLOCK;
kunmap(dentry_page);
f2fs_put_page(dentry_page, 1);
dentry_page = NULL;
}
success:
if (dentry_page && !IS_ERR(dentry_page)) {
kunmap(dentry_page);
f2fs_put_page(dentry_page, 1);
}
return 0;
}
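f2fs_readdir() packs the directory block index and the slot within that block into a single file position. The sketch below shows that encoding in isolation. The value 214 for NR_DENTRY_IN_BLOCK is an assumption (the constant is defined in a header that is not shown here), and struct dir_pos, encode_pos() and decode_pos() are hypothetical helper names.

```c
#include <assert.h>

/* Assumed value; the real constant lives in the f2fs on-disk header. */
#define NR_DENTRY_IN_BLOCK 214

/* A directory position split into its two components. */
struct dir_pos {
	unsigned long block;	/* index of the dentry block */
	unsigned int slot;	/* slot (bit position) inside that block */
};

/* Combine a block index and a slot into a single position value. */
static unsigned long encode_pos(unsigned long block, unsigned int slot)
{
	assert(slot < NR_DENTRY_IN_BLOCK);
	return block * NR_DENTRY_IN_BLOCK + slot;
}

/* Split a position value back into block index and slot. */
static struct dir_pos decode_pos(unsigned long pos)
{
	struct dir_pos p;

	p.block = pos / NR_DENTRY_IN_BLOCK;
	p.slot = pos % NR_DENTRY_IN_BLOCK;
	return p;
}
```

Because the slot is always smaller than the number of slots per block, the two components can be recovered with one division and one modulo, which is what the start of f2fs_readdir() does.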
const struct file_operations f2fs_dir_operations = {
	.llseek		= generic_file_llseek,
	.read		= generic_read_dir,
	.readdir	= f2fs_readdir,
	.fsync		= f2fs_sync_file,
	.unlocked_ioctl	= f2fs_ioctl,
};
/*
* fs/f2fs/gc.h
*
* Copyright (c) 2012 Samsung Electronics Co., Ltd.
* http://www.samsung.com/
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#define GC_THREAD_NAME	"f2fs_gc_task"
/* a threshold to determine whether the IO subsystem is idle or not */
#define GC_THREAD_MIN_WB_PAGES		1
#define GC_THREAD_MIN_SLEEP_TIME	10000	/* milliseconds */
#define GC_THREAD_MAX_SLEEP_TIME	30000	/* milliseconds */
#define GC_THREAD_NOGC_SLEEP_TIME	10000	/* milliseconds */
#define LIMIT_INVALID_BLOCK	40 /* percentage over total user space */
#define LIMIT_FREE_BLOCK	40 /* percentage over invalid + free space */

/* Search max. number of dirty segments to select a victim segment */
#define MAX_VICTIM_SEARCH 20
enum {
	GC_NONE = 0,
	GC_ERROR,
	GC_OK,
	GC_NEXT,
	GC_BLOCKED,
	GC_DONE,
};

struct f2fs_gc_kthread {
	struct task_struct *f2fs_gc_task;
	wait_queue_head_t gc_wait_queue_head;
};

struct inode_entry {
	struct list_head list;
	struct inode *inode;
};
/*
* inline functions
*/
static inline block_t free_user_blocks(struct f2fs_sb_info *sbi)
{
	if (free_segments(sbi) < overprovision_segments(sbi))
		return 0;
	else
		return (free_segments(sbi) - overprovision_segments(sbi))
			<< sbi->log_blocks_per_seg;
}

static inline block_t limit_invalid_user_blocks(struct f2fs_sb_info *sbi)
{
	return (long)(sbi->user_block_count * LIMIT_INVALID_BLOCK) / 100;
}

static inline block_t limit_free_user_blocks(struct f2fs_sb_info *sbi)
{
	block_t reclaimable_user_blocks = sbi->user_block_count -
		written_block_count(sbi);
	return (long)(reclaimable_user_blocks * LIMIT_FREE_BLOCK) / 100;
}
static inline long increase_sleep_time(long wait)
{
	wait += GC_THREAD_MIN_SLEEP_TIME;
	if (wait > GC_THREAD_MAX_SLEEP_TIME)
		wait = GC_THREAD_MAX_SLEEP_TIME;
	return wait;
}

static inline long decrease_sleep_time(long wait)
{
	wait -= GC_THREAD_MIN_SLEEP_TIME;
	if (wait <= GC_THREAD_MIN_SLEEP_TIME)
		wait = GC_THREAD_MIN_SLEEP_TIME;
	return wait;
}
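The two helpers above step the GC thread's sleep interval up or down by one minimum interval and clamp it to the [min, max] range. A minimal user-space sketch of the same clamping follows. The SKETCH_* constants copy the values of GC_THREAD_MIN_SLEEP_TIME and GC_THREAD_MAX_SLEEP_TIME, while the function names sketch_increase(), sketch_decrease() and sketch_adjust() are hypothetical. How the real GC thread decides between the two directions is not shown in this listing, so the `busy` flag is purely an assumption for illustration.

```c
#include <assert.h>

#define SKETCH_MIN_SLEEP_MS 10000L
#define SKETCH_MAX_SLEEP_MS 30000L

/* Back off by one step, never exceeding the maximum interval. */
static long sketch_increase(long wait)
{
	wait += SKETCH_MIN_SLEEP_MS;
	if (wait > SKETCH_MAX_SLEEP_MS)
		wait = SKETCH_MAX_SLEEP_MS;
	return wait;
}

/* Speed up by one step, never dropping below the minimum interval. */
static long sketch_decrease(long wait)
{
	wait -= SKETCH_MIN_SLEEP_MS;
	if (wait <= SKETCH_MIN_SLEEP_MS)
		wait = SKETCH_MIN_SLEEP_MS;
	return wait;
}

/*
 * Hypothetical policy: sleep longer while the device is busy,
 * shorter otherwise. The real decision logic is an assumption here.
 */
static long sketch_adjust(long wait, int busy)
{
	assert(wait >= SKETCH_MIN_SLEEP_MS && wait <= SKETCH_MAX_SLEEP_MS);
	return busy ? sketch_increase(wait) : sketch_decrease(wait);
}
```

With these constants the interval can only take the values 10000, 20000 and 30000 milliseconds when it starts at the minimum.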
static inline bool has_enough_invalid_blocks(struct f2fs_sb_info *sbi)
{
	block_t invalid_user_blocks = sbi->user_block_count -
					written_block_count(sbi);
	/*
	 * Background GC is triggered with the following condition.
	 * 1. There are a number of invalid blocks.
	 * 2. There is not enough free space.
	 */
	if (invalid_user_blocks > limit_invalid_user_blocks(sbi) &&
			free_user_blocks(sbi) < limit_free_user_blocks(sbi))
		return true;
	return false;
}
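The two-part trigger condition in has_enough_invalid_blocks() is easier to follow with concrete numbers. The sketch below restates it over a plain struct. struct gc_stats and needs_background_gc() are hypothetical names, the counters are supplied directly instead of being read from the superblock info, and the 40% thresholds copy LIMIT_INVALID_BLOCK and LIMIT_FREE_BLOCK. Unlike the kernel helpers, the sketch widens to 64 bits before multiplying, which is a choice made here and not something taken from the source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_LIMIT_INVALID_PCT 40	/* percent of total user blocks */
#define SKETCH_LIMIT_FREE_PCT    40	/* percent of invalid + free blocks */

/* Hypothetical snapshot of the counters the check depends on. */
struct gc_stats {
	uint32_t user_blocks;		/* total user-visible blocks */
	uint32_t written_blocks;	/* blocks holding valid data */
	uint32_t free_user_blocks;	/* blocks in free segments */
};

/* Return true when both conditions for background GC hold. */
static bool needs_background_gc(const struct gc_stats *s)
{
	uint32_t reclaimable;
	uint64_t limit_invalid, limit_free;

	assert(s->written_blocks <= s->user_blocks);
	reclaimable = s->user_blocks - s->written_blocks;

	/* Widen before multiplying so large volumes do not overflow. */
	limit_invalid = (uint64_t)s->user_blocks * SKETCH_LIMIT_INVALID_PCT / 100;
	limit_free = (uint64_t)reclaimable * SKETCH_LIMIT_FREE_PCT / 100;

	return reclaimable > limit_invalid &&
		s->free_user_blocks < limit_free;
}
```

For example, with 1000 user blocks, 500 written and 100 free, the invalid limit is 400 and the free limit is 200, so both conditions hold and background GC would be requested.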
static inline int is_idle(struct f2fs_sb_info *sbi)
{
	struct block_device *bdev = sbi->sb->s_bdev;
	struct request_queue *q = bdev_get_queue(bdev);
	struct request_list *rl = &q->root_rl;
	return !(rl->count[BLK_RW_SYNC]) && !(rl->count[BLK_RW_ASYNC]);
}

static inline bool should_do_checkpoint(struct f2fs_sb_info *sbi)
{
	unsigned int pages_per_sec = sbi->segs_per_sec *
					(1 << sbi->log_blocks_per_seg);
	int node_secs = ((get_pages(sbi, F2FS_DIRTY_NODES) + pages_per_sec - 1)
			>> sbi->log_blocks_per_seg) / sbi->segs_per_sec;
	int dent_secs = ((get_pages(sbi, F2FS_DIRTY_DENTS) + pages_per_sec - 1)
			>> sbi->log_blocks_per_seg) / sbi->segs_per_sec;
	return free_sections(sbi) <= (node_secs + 2 * dent_secs + 2);
}
/*
* fs/f2fs/hash.c
*
* Copyright (c) 2012 Samsung Electronics Co., Ltd.
* http://www.samsung.com/
*
* Portions of this code from linux/fs/ext3/hash.c
*
* Copyright (C) 2002 by Theodore Ts'o
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include <linux/types.h>
#include <linux/fs.h>
#include <linux/f2fs_fs.h>
#include <linux/cryptohash.h>
#include <linux/pagemap.h>
#include "f2fs.h"
/*
* Hashing code copied from ext3
*/
#define DELTA 0x9E3779B9
static void TEA_transform(unsigned int buf[4], unsigned int const in[])
{
	__u32 sum = 0;
	__u32 b0 = buf[0], b1 = buf[1];
	__u32 a = in[0], b = in[1], c = in[2], d = in[3];
	int n = 16;

	do {
		sum += DELTA;
		b0 += ((b1 << 4)+a) ^ (b1+sum) ^ ((b1 >> 5)+b);
		b1 += ((b0 << 4)+c) ^ (b0+sum) ^ ((b0 >> 5)+d);
	} while (--n);

	buf[0] += b0;
	buf[1] += b1;
}
static void str2hashbuf(const char *msg, int len, unsigned int *buf, int num)
{
	unsigned pad, val;
	int i;

	pad = (__u32)len | ((__u32)len << 8);
	pad |= pad << 16;

	val = pad;
	if (len > num * 4)
		len = num * 4;
	for (i = 0; i < len; i++) {
		if ((i % 4) == 0)
			val = pad;
		val = msg[i] + (val << 8);
		if ((i % 4) == 3) {
			*buf++ = val;
			val = pad;
			num--;
		}
	}
	if (--num >= 0)
		*buf++ = val;
	while (--num >= 0)
		*buf++ = pad;
}
f2fs_hash_t f2fs_dentry_hash(const char *name, int len)
{
	__u32 hash, minor_hash;
	f2fs_hash_t f2fs_hash;
	const char *p;
	__u32 in[8], buf[4];

	/* Initialize the default seed for the hash checksum functions */
	buf[0] = 0x67452301;
	buf[1] = 0xefcdab89;
	buf[2] = 0x98badcfe;
	buf[3] = 0x10325476;

	p = name;
	while (len > 0) {
		str2hashbuf(p, len, in, 4);
		TEA_transform(buf, in);
		len -= 16;
		p += 16;
	}
	hash = buf[0];
	minor_hash = buf[1];

	f2fs_hash = cpu_to_le32(hash & ~F2FS_HASH_COL_BIT);
	return f2fs_hash;
}
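The name hash above can be exercised in user space to see how a name is packed into 32-bit words and mixed 16 bytes at a time. The following is a sketch under stated assumptions, not the kernel routine: tea_transform(), str_to_hashbuf() and sketch_name_hash() are hypothetical names, bytes are read as unsigned char (the kernel code uses plain char, which only matters for non-ASCII names), and the final F2FS_HASH_COL_BIT masking and little-endian conversion are left out because their definitions are not part of this listing.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TEA_DELTA 0x9E3779B9u

/* Sixteen TEA rounds mixing four key words into buf[0] and buf[1]. */
static void tea_transform(uint32_t buf[4], const uint32_t in[4])
{
	uint32_t sum = 0, b0 = buf[0], b1 = buf[1];
	int n;

	for (n = 0; n < 16; n++) {
		sum += TEA_DELTA;
		b0 += ((b1 << 4) + in[0]) ^ (b1 + sum) ^ ((b1 >> 5) + in[1]);
		b1 += ((b0 << 4) + in[2]) ^ (b0 + sum) ^ ((b0 >> 5) + in[3]);
	}
	buf[0] += b0;
	buf[1] += b1;
}

/* Pack up to num*4 bytes of msg into buf, padding with a length pattern. */
static void str_to_hashbuf(const char *msg, int len, uint32_t *buf, int num)
{
	uint32_t pad, val;
	int i;

	pad = (uint32_t)len | ((uint32_t)len << 8);
	pad |= pad << 16;
	val = pad;
	if (len > num * 4)
		len = num * 4;
	for (i = 0; i < len; i++) {
		if ((i % 4) == 0)
			val = pad;
		/* Assumption: bytes are treated as unsigned in this sketch. */
		val = (unsigned char)msg[i] + (val << 8);
		if ((i % 4) == 3) {
			*buf++ = val;
			val = pad;
			num--;
		}
	}
	if (--num >= 0)
		*buf++ = val;
	while (--num >= 0)
		*buf++ = pad;
}

/* Hash a NUL-terminated name, consuming it in 16-byte chunks. */
static uint32_t sketch_name_hash(const char *name)
{
	uint32_t in[4];
	uint32_t buf[4] = { 0x67452301u, 0xefcdab89u, 0x98badcfeu, 0x10325476u };
	size_t len = strlen(name), off = 0;

	while (off < len) {
		str_to_hashbuf(name + off, (int)(len - off), in, 4);
		tea_transform(buf, in);
		off += 16;
	}
	return buf[0];
}
```

For the three-byte name "abc" the padding pattern is 0x03030303, so the first packed word becomes 0x03616263 and the remaining three words are pure padding.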
/*
* fs/f2fs/xattr.h
*
* Copyright (c) 2012 Samsung Electronics Co., Ltd.
* http://www.samsung.com/
*
* Portions of this code from linux/fs/ext2/xattr.h
*
* On-disk format of extended attributes for the ext2 filesystem.
*
* (C) 2001 Andreas Gruenbacher, <a.gruenbacher@computer.org>
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#ifndef __F2FS_XATTR_H__
#define __F2FS_XATTR_H__
#include <linux/init.h>
#include <linux/xattr.h>
/* Magic value in attribute blocks */
#define F2FS_XATTR_MAGIC 0xF2F52011
/* Maximum number of references to one attribute block */
#define F2FS_XATTR_REFCOUNT_MAX 1024
/* Name indexes */
#define F2FS_SYSTEM_ADVISE_PREFIX "system.advise"
#define F2FS_XATTR_INDEX_USER 1
#define F2FS_XATTR_INDEX_POSIX_ACL_ACCESS 2
#define F2FS_XATTR_INDEX_POSIX_ACL_DEFAULT 3
#define F2FS_XATTR_INDEX_TRUSTED 4
#define F2FS_XATTR_INDEX_LUSTRE 5
#define F2FS_XATTR_INDEX_SECURITY 6
#define F2FS_XATTR_INDEX_ADVISE 7
struct f2fs_xattr_header {
	__le32	h_magic;	/* magic number for identification */
	__le32	h_refcount;	/* reference count */
	__u32	h_reserved[4];	/* zero right now */
};

struct f2fs_xattr_entry {
	__u8	e_name_index;
	__u8	e_name_len;
	__le16	e_value_size;	/* size of attribute value */
	char	e_name[0];	/* attribute name */
};
#define XATTR_HDR(ptr) ((struct f2fs_xattr_header *)(ptr))
#define XATTR_ENTRY(ptr) ((struct f2fs_xattr_entry *)(ptr))
#define XATTR_FIRST_ENTRY(ptr) (XATTR_ENTRY(XATTR_HDR(ptr)+1))
#define XATTR_ROUND		(3)
#define XATTR_ALIGN(size)	(((size) + XATTR_ROUND) & ~XATTR_ROUND)
#define ENTRY_SIZE(entry) (XATTR_ALIGN(sizeof(struct f2fs_xattr_entry) + \
			(entry)->e_name_len + le16_to_cpu((entry)->e_value_size)))
#define XATTR_NEXT_ENTRY(entry)	((struct f2fs_xattr_entry *)((char *)(entry) +\
			ENTRY_SIZE(entry)))
#define IS_XATTR_LAST_ENTRY(entry) (*(__u32 *)(entry) == 0)
#define list_for_each_xattr(entry, addr) \
		for (entry = XATTR_FIRST_ENTRY(addr);\
				!IS_XATTR_LAST_ENTRY(entry);\
				entry = XATTR_NEXT_ENTRY(entry))
#define MIN_OFFSET	XATTR_ALIGN(PAGE_SIZE - \
			sizeof(struct node_footer) - \
			sizeof(__u32))
#define MAX_VALUE_LEN	(MIN_OFFSET - sizeof(struct f2fs_xattr_header) - \
			sizeof(struct f2fs_xattr_entry))
/*
* On-disk structure of f2fs_xattr
* We use only 1 block for xattr.
*
* +--------------------+
* | f2fs_xattr_header |
* | |
* +--------------------+
* | f2fs_xattr_entry |
* | .e_name_index = 1 |
* | .e_name_len = 3 |
* | .e_value_size = 14 |
* | .e_name = "foo" |
* | "value_of_xattr" |<- value_offs = e_name + e_name_len
* +--------------------+
* | f2fs_xattr_entry |
* | .e_name_index = 4 |
* | .e_name = "bar" |
* +--------------------+
* | |
* | Free |
* | |
* +--------------------+<- MIN_OFFSET
* | node_footer |
* | (nid, ino, offset) |
* +--------------------+
*
**/
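The layout above packs each entry as a 4-byte fixed part, followed by the name and then the value, with the total rounded up to a 4-byte boundary. The sketch below works through that size arithmetic for the "foo" entry in the diagram. SKETCH_ENTRY_FIXED, sketch_align() and sketch_entry_size() are hypothetical names, and the 4-byte fixed part is derived from the three fields of struct f2fs_xattr_entry (two bytes plus one 16-bit field) on the assumption that no padding is inserted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ROUND        ((size_t)3)	/* align to 4-byte boundaries */
#define SKETCH_ENTRY_FIXED  ((size_t)4)	/* name_index + name_len + value_size */

/* Round size up to the next multiple of 4. */
static size_t sketch_align(size_t size)
{
	return (size + SKETCH_ROUND) & ~SKETCH_ROUND;
}

/* Total bytes occupied by one entry: fixed part, name, value, padding. */
static size_t sketch_entry_size(uint8_t name_len, uint16_t value_size)
{
	assert(name_len > 0);
	return sketch_align(SKETCH_ENTRY_FIXED + name_len + value_size);
}
```

For the diagram's first entry (name length 3, value size 14) the unpadded size is 4 + 3 + 14 = 21 bytes, which rounds up to 24, so the next entry would start 24 bytes after this one.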
#ifdef CONFIG_F2FS_FS_XATTR
extern const struct xattr_handler f2fs_xattr_user_handler;
extern const struct xattr_handler f2fs_xattr_trusted_handler;
extern const struct xattr_handler f2fs_xattr_acl_access_handler;
extern const struct xattr_handler f2fs_xattr_acl_default_handler;
extern const struct xattr_handler f2fs_xattr_advise_handler;
extern const struct xattr_handler *f2fs_xattr_handlers[];
extern int f2fs_setxattr(struct inode *inode, int name_index, const char *name,
const void *value, size_t value_len);
extern int f2fs_getxattr(struct inode *inode, int name_index, const char *name,
void *buffer, size_t buffer_size);
extern ssize_t f2fs_listxattr(struct dentry *dentry, char *buffer,
size_t buffer_size);
#else
#define f2fs_xattr_handlers NULL
static inline int f2fs_setxattr(struct inode *inode, int name_index,
const char *name, const void *value, size_t value_len)
{
return -EOPNOTSUPP;
}
static inline int f2fs_getxattr(struct inode *inode, int name_index,
const char *name, void *buffer, size_t buffer_size)
{
return -EOPNOTSUPP;
}
static inline ssize_t f2fs_listxattr(struct dentry *dentry, char *buffer,
size_t buffer_size)
{
return -EOPNOTSUPP;
}
#endif
#endif /* __F2FS_XATTR_H__ */
@@ -23,6 +23,7 @@
#define EXT4_SUPER_MAGIC 0xEF53
#define BTRFS_SUPER_MAGIC 0x9123683E
#define NILFS_SUPER_MAGIC 0x3434
#define F2FS_SUPER_MAGIC 0xF2F52010
#define HPFS_SUPER_MAGIC 0xf995e849
#define ISOFS_SUPER_MAGIC 0x9660
#define JFFS2_SUPER_MAGIC 0x72b6