Commit Graph

179 Commits

Author SHA1 Message Date
Dave Chinner
9bb54cb56a xfs: clean up MIN/MAX
Get rid of the MIN/MAX macros and just use the native min/max macros
directly in the XFS code.

Signed-Off-By: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-08 10:07:52 -07:00
Dave Chinner
0b61f8a407 xfs: convert to SPDX license tags
Remove the verbose license text from XFS files and replace them
with SPDX tags. This does not change the license of any of the code,
merely refers to the common, up-to-date license files in LICENSES/

This change was mostly scripted. fs/xfs/Makefile and
fs/xfs/libxfs/xfs_fs.h were modified by hand, the rest were detected
and modified by the following command:

for f in `git grep -l "GNU General" fs/xfs/` ; do
	echo $f
	cat $f | awk -f hdr.awk > $f.new
	mv -f $f.new $f
done

And the hdr.awk script that did the modification (including
detecting the difference between GPL-2.0 and GPL-2.0+ licenses)
is as follows:

$ cat hdr.awk
BEGIN {
	hdr = 1.0
	tag = "GPL-2.0"
	str = ""
}

/^ \* This program is free software/ {
	hdr = 2.0;
	next
}

/any later version./ {
	tag = "GPL-2.0+"
	next
}

/^ \*\// {
	if (hdr > 0.0) {
		print "// SPDX-License-Identifier: " tag
		print str
		print $0
		str=""
		hdr = 0.0
		next
	}
	print $0
	next
}

/^ \* / {
	if (hdr > 1.0)
		next
	if (hdr > 0.0) {
		if (str != "")
			str = str "\n"
		str = str $0
		next
	}
	print $0
	next
}

/^ \*/ {
	if (hdr > 0.0)
		next
	print $0
	next
}

// {
	if (hdr > 0.0) {
		if (str != "")
			str = str "\n"
		str = str $0
		next
	}
	print $0
}

END { }
$

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2018-06-06 14:17:53 -07:00
Christoph Hellwig
66f364649d xfs: remove if_rdev
We can simply use the i_rdev field in the Linux inode and just convert
to and from the XFS dev_t when reading or logging/writing the inode.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-10-26 15:38:27 -07:00
Christoph Hellwig
42b67dc6ff xfs: remove the never fully implemented UUID fork format
Remove the dead code dealing with the UUID fork format that was never
implemented in Linux (and neither in IRIX as far as I know).

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2017-10-26 15:38:27 -07:00
Darrick J. Wong
91fb9afc08 xfs: create inode pointer verifiers
Create some helper functions to check that inode pointers point to
somewhere within the filesystem and not at the static AG metadata.
Move xfs_internal_inum and create a directory inode check function.
We will use these functions in scrub and elsewhere.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2017-10-26 15:38:23 -07:00
Darrick J. Wong
2678809799 xfs: export various function for the online scrubber
Export various internal functions so that the online scrubber can use
them to check the state of metadata.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2017-06-19 14:11:34 -07:00
Darrick J. Wong
bf9216f922 xfs: fix kernel memory exposure problems
Fix a memory exposure problems in inumbers where we allocate an array of
structures with holes, fail to zero the holes, then blindly copy the
kernel memory contents (junk and all) into userspace.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2017-04-03 12:22:39 -07:00
Darrick J. Wong
08b005f133 xfs: remove kmem_zalloc_greedy
The sole remaining caller of kmem_zalloc_greedy is bulkstat, which uses
it to grab 1-4 pages for staging of inobt records.  The infinite loop in
the greedy allocation function is causing hangs[1] in generic/269, so
just get rid of the greedy allocator in favor of kmem_zalloc_large.
This makes bulkstat somewhat more likely to ENOMEM if there's really no
pages to spare, but eliminates a source of hangs.

[1] http://lkml.kernel.org/r/20170301044634.rgidgdqqiiwsmfpj%40XZHOUW.usersys.redhat.com

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
v2: remove single-page fallback
2017-03-07 20:10:50 -08:00
Darrick J. Wong
f7ca352272 xfs: create a separate cow extent size hint for the allocator
Create a per-inode extent size allocator hint for copy-on-write.  This
hint is separate from the existing extent size hint so that CoW can
take advantage of the fragmentation-reducing properties of extent size
hints without disabling delalloc for regular writes.

The extent size hint that's fed to the allocator during a copy on
write operation is the greater of the cowextsize and regular extsize
hint.

During reflink, if we're sharing the entire source file to the entire
destination file and the destination file doesn't already have a
cowextsize hint, propagate the source file's cowextsize hint to the
destination file.

Furthermore, zero the bulkstat buffer prior to setting the fields
so that we don't copy kernel memory contents into userspace.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2016-10-05 16:26:26 -07:00
Dave Chinner
c19b3b05ae xfs: mode di_mode to vfs inode
Move the di_mode value from the xfs_icdinode to the VFS inode, reducing
the xfs_icdinode byte another 2 bytes and collapsing another 2 byte hole
in the structure.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-09 16:54:58 +11:00
Dave Chinner
9e9a2674e4 xfs: move inode generation count to VFS inode
Pull another 4 bytes out of the xfs_icdinode.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-09 16:54:58 +11:00
Dave Chinner
54d7b5c1d0 xfs: use vfs inode nlink field everywhere
The VFS tracks the inode nlink just like the xfs_icdinode. We can
remove the variable from the icdinode and use the VFS inode variable
everywhere, reducing the size of the xfs_icdinode by a further 4
bytes.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-09 16:54:58 +11:00
Dave Chinner
3987848c7c xfs: remove timestamps from incore inode
The struct xfs_inode has two copies of the current timestamps in it,
one in the vfs inode and one in the struct xfs_icdinode. Now that we
no longer log the struct xfs_icdinode directly, we don't need to
keep the timestamps in this structure. instead we can copy them
straight out of the VFS inode when formatting the inode log item or
the on-disk inode.

This reduces the struct xfs_inode in size by 24 bytes.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-02-09 16:54:58 +11:00
Brian Foster
f307080a62 xfs: fix btree cursor error cleanups
The btree cursor cleanup function takes an error parameter that
affects how buffers are released from the cursor. All buffers are
released in the event of error. Several callers do not specify the
XFS_BTREE_ERROR flag in the event of error, however. This can cause
buffers to hang around locked or with an elevated hold count and
thus lead to umount hangs in the event of errors.

Fix up the xfs_btree_del_cursor() callers to pass XFS_BTREE_ERROR if
the cursor is being torn down due to error.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-08-19 10:00:53 +10:00
Brian Foster
12d0714d4b xfs: use actual inode count for sparse records in bulkstat/inumbers
The bulkstat and inumbers mechanisms make the assumption that inode
records consist of a full 64 inode chunk in several places. For example,
this is used to track how many inodes have been processed overall as
well as to determine whether a record has allocated inodes that must be
handled.

This assumption is invalid for sparse inode records. While sparse inodes
will be marked as free in the ir_free mask, they are not accounted as
free in ir_freecount because they cannot be allocated. Therefore,
ir_freecount may be less than 64 inodes in an inode record for which all
physically allocated inodes are free (and in turn ir_freecount < 64 does
not signify that the record has allocated inodes).

The new in-core inobt record format includes the ir_count field. This
holds the number of true, physical inodes tracked by the record. The
in-core ir_count field is always valid as it is hardcoded to
XFS_INODES_PER_CHUNK when sparse inodes is not enabled. Use ir_count to
handle inode records correctly in bulkstat in a generic manner.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-05-29 09:04:19 +10:00
Eric Sandeen
5fb5aeeeb6 xfs: pass mp to XFS_WANT_CORRUPTED_RETURN
Today, if we hit an XFS_WANT_CORRUPTED_RETURN we don't print any
information about which filesystem hit it.  Passing in the mp allows
us to print the filesystem (device) name, which is a pretty critical
piece of information.

Tested by running fsfuzzer 'til I hit some.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-23 22:39:13 +11:00
Dave Chinner
6044e4386c Merge branch 'xfs-misc-fixes-for-3.19-2' into for-next
Conflicts:
	fs/xfs/xfs_iops.c
2014-12-04 09:46:17 +11:00
Dave Chinner
32296f865e xfs: fix set-but-unused warnings
The kernel compile doesn't turn on these checks by default, so it's
only when I do a kernel-user sync that I find that there are lots of
compiler warnings waiting to be fixed. Fix up these set-but-unused
warnings.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-12-04 09:43:17 +11:00
Dave Chinner
216875a594 Merge branch 'xfs-consolidate-format-defs' into for-next 2014-11-28 14:52:16 +11:00
Christoph Hellwig
508b6b3b73 xfs: merge xfs_inum.h into xfs_format.h
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-28 14:27:10 +11:00
Christoph Hellwig
bb58e6188a xfs: move most of xfs_sb.h to xfs_format.h
More on-disk format consolidation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-28 14:27:09 +11:00
Christoph Hellwig
4fb6e8ade2 xfs: merge xfs_ag.h into xfs_format.h
More on-disk format consolidation.  A few declarations that weren't on-disk
format related move into better suitable spots.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-28 14:25:04 +11:00
Christoph Hellwig
6d3ebaae7c xfs: merge xfs_dinode.h into xfs_format.h
More consolidatation for the on-disk format defintions.  Note that the
XFS_IS_REALTIME_INODE moves to xfs_linux.h instead as it is not related
to the on disk format, but depends on a CONFIG_ option.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-28 14:24:06 +11:00
Dave Chinner
0027589926 xfs: track bulkstat progress by agino
The bulkstat main loop progress is tracked by the "lastino"
variable, which is a full 64 bit inode. However, the loop actually
works on agno/agino pairs, and so there's a significant disconnect
between the rest of the loop and the main cursor. Convert this to
use the agino, and pass the agino into the chunk formatting function
and convert it too.

This gets rid of the inconsistency in the loop processing, and
finally makes it simple for us to skip inodes at any point in the
loop simply by incrementing the agino cursor.

cc: <stable@vger.kernel.org> # 3.17
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-07 08:33:52 +11:00
Dave Chinner
febe3cbe38 xfs: bulkstat error handling is broken
The error propagation is a horror - xfs_bulkstat() returns
a rval variable which is only set if there are formatter errors. Any
sort of btree walk error or corruption will cause the bulkstat walk
to terminate but will not pass an error back to userspace. Worse
is the fact that formatter errors will also be ignored if any inodes
were correctly formatted into the user buffer.

Hence bulkstat can fail badly yet still report success to userspace.
This causes significant issues with xfsdump not dumping everything
in the filesystem yet reporting success. It's not until a restore
fails that there is any indication that the dump was bad and tha
bulkstat failed. This patch now triggers xfsdump to fail with
bulkstat errors rather than silently missing files in the dump.

This now causes bulkstat to fail when the lastino cookie does not
fall inside an existing inode chunk. The pre-3.17 code tolerated
that error by allowing the code to move to the next inode chunk
as the agino target is guaranteed to fall into the next btree
record.

With the fixes up to this point in the series, xfsdump now passes on
the troublesome filesystem image that exposes all these bugs.

cc: <stable@vger.kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
2014-11-07 08:31:15 +11:00
Dave Chinner
6e57c542cb xfs: bulkstat main loop logic is a mess
There are a bunch of variables tha tare more wildy scoped than they
need to be, obfuscated user buffer checks and tortured "next inode"
tracking. This all needs cleaning up to expose the real issues that
need fixing.

cc: <stable@vger.kernel.org> # 3.17
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-07 08:31:13 +11:00
Dave Chinner
2b831ac6bc xfs: bulkstat chunk-formatter has issues
The loop construct has issues:
	- clustidx is completely unused, so remove it.
	- the loop tries to be smart by terminating when the
	  "freecount" tells it that all inodes are free. Just drop
	  it as in most cases we have to scan all inodes in the
	  chunk anyway.
	- move the "user buffer left" condition check to the only
	  point where we consume space int eh user buffer.
	- move the initialisation of agino out of the loop, leaving
	  just a simple loop control logic using the clusteridx.

Also, double handling of the user buffer variables leads to problems
tracking the current state - use the cursor variables directly
rather than keeping local copies and then having to update the
cursor before returning.

cc: <stable@vger.kernel.org> # 3.17
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-07 08:30:58 +11:00
Dave Chinner
bf4a5af20d xfs: bulkstat chunk formatting cursor is broken
The xfs_bulkstat_agichunk formatting cursor takes buffer values from
the main loop and passes them via the structure to the chunk
formatter, and the writes the changed values back into the main loop
local variables. Unfortunately, this complex dance is full of corner
cases that aren't handled correctly.

The biggest problem is that it is double handling the information in
both the main loop and the chunk formatting function, leading to
inconsistent updates and endless loops where progress is not made.

To fix this, push the struct xfs_bulkstat_agichunk outwards to be
the primary holder of user buffer information. this removes the
double handling in the main loop.

Also, pass the last inode processed by the chunk formatter as a
separate parameter as it purely an output variable and is not
related to the user buffer consumption cursor.

Finally, the chunk formatting code is not shared by anyone, so make
it local to xfs_itable.c.

cc: <stable@vger.kernel.org> # 3.17
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-07 08:30:30 +11:00
Dave Chinner
afa947cb52 xfs: bulkstat btree walk doesn't terminate
The bulkstat code has several different ways of detecting the end of
an AG when doing a walk. They are not consistently detected, and the
code that checks for the end of AG conditions is not consistently
coded. Hence the are conditions where the walk code can get stuck in
an endless loop making no progress and not triggering any
termination conditions.

Convert all the "tmp/i" status return codes from btree operations
to a common name (stat) and apply end-of-ag detection to these
operations consistently.

cc: <stable@vger.kernel.org> # 3.17
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-07 08:29:57 +11:00
Jan Kara
7a19dee116 xfs: Check error during inode btree iteration in xfs_bulkstat()
xfs_bulkstat() doesn't check error return from xfs_btree_increment(). In
case of specific fs corruption that could result in xfs_bulkstat()
entering an infinite loop because we would be looping over the same
chunk over and over again. Fix the problem by checking the return value
and terminating the loop properly.

Coverity-id: 1231338
cc: <stable@vger.kernel.org>
Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Jie Liu <jeff.u.liu@gmail.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-10-30 10:34:52 +11:00
Dave Chinner
a6bbce54ef xfs: bulkstat doesn't release AGI buffer on error
The recent refactoring of the bulkstat code left a small landmine in
the code. If a inobt read fails, then the tree walk is aborted and
returns without releasing the AGI buffer or freeing the cursor. This
can lead to a subsequent bulkstat call hanging trying to grab the
AGI buffer again.

cc: <stable@vger.kernel.org>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-10-29 08:22:18 +11:00
Eric Sandeen
a8b1ee8baf xfs: fix agno increment in xfs_inumbers() loop
caused a regression in xfs_inumbers, which in turn broke
xfsdump, causing incomplete dumps.

The loop in xfs_inumbers() needs to fill the user-supplied
buffers, and iterates via xfs_btree_increment, reading new
ags as needed.

But the first time through the loop, if xfs_btree_increment()
succeeds, we continue, which triggers the ++agno at the bottom
of the loop, and we skip to soon to the next ag - without
the proper setup under next_ag to read the next ag.

Fix this by removing the agno increment from the loop conditional,
and only increment agno if we have actually hit the code under
the next_ag: target.

Cc: stable@vger.kernel.org
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-10-13 10:21:53 +11:00
Jie Liu
1e773c4989 xfs: introduce xfs_bulkstat_ag_ichunk
Introduce xfs_bulkstat_ag_ichunk() to process inodes in chunk with a
pointer to a formatter function that will iget the inode and fill in
the appropriate structure.

Refactor xfs_bulkstat() with it.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-08-04 11:22:31 +10:00
Jie Liu
f3d1e58743 xfs: introduce xfs_bulkstat_grab_ichunk
From: Jie Liu <jeff.liu@oracle.com>

Introduce xfs_bulkstat_grab_ichunk() to look up an inode chunk in where
the given inode resides, then grab the record.  Update the data for the
pointed-to record if the inode was not the last in the chunk and there
are some left allocated, return the grabbed inode count on success.

Refactor xfs_bulkstat() with it.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-07-24 18:42:21 +10:00
Jie Liu
4b8fdfecd8 xfs: introduce xfs_bulkstat_ichunk_ra
From: Jie Liu <jeff.liu@oracle.com>

Introduce xfs_bulkstat_ichunk_ra() to loop over all clusters in the
next inode chunk, then performs readahead if there are any allocated
inodes in that cluster.

Refactor xfs_bulkstat() with it.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-07-24 18:41:18 +10:00
Jie Liu
d4c2734875 xfs: fix error handling at xfs_bulkstat
From: Jie Liu <jeff.liu@oracle.com>

We should not ignore the btree operation errors at xfs_bulkstat() but
to propagate them if any. This patch fix two places in this function
and the remaining things will be fixed with code refactoring thereafter.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-07-24 18:40:43 +10:00
Jie Liu
296dfd7fdb xfs: remove redundant user buffer count checks at xfs_bulkstat
From: Jie Liu <jeff.liu@oracle.com>

Remove the redundant user buffer and count checks as it has already
been validated at xfs_ioc_bulkstat().

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-07-24 18:40:26 +10:00
Jie Liu
c7cb51dcb0 xfs: fix error handling at xfs_inumbers
From: Jie Liu <jeff.liu@oracle.com>

To fetch the file system number tables, we currently just ignore the
errors and proceed to loop over the next AG or bump agino to the next
chunk in case of btree operations failed, that is not properly because
those errors might hint us potential file system problems.

This patch rework xfs_inumbers() to handle the btree operation errors
as well as the loop conditions.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-07-24 12:18:47 +10:00
Jie Liu
549fa00679 xfs: consolidate xfs_inumbers
From: Jie Liu <jeff.liu@oracle.com>

Consolidate xfs_inumbers() to make the formatter function return correct
error and make the source code looks a bit neat.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-07-24 12:11:47 +10:00
Christoph Hellwig
d716f8eedb xfs: remove xfs_bulkstat_single
From: Christoph Hellwig <hch@lst.de>

xfs_bukstat_one doesn't have any failure case that would go away when
called through xfs_bulkstat, so remove the fallback and the now unessecary
xfs_bulkstat_single function.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-07-24 12:07:15 +10:00
Jie Liu
8fe657760d xfs: remove redundant stat assignment in xfs_bulkstat_one_int
From: Jie Liu <jeff.liu@oracle.com>

Remove the redundant BULKSTAT_RV_NOTHING assignment in case of call
xfs_iget() failed at xfs_bulkstat_one_int().

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-07-24 11:33:28 +10:00
Dave Chinner
2451337dd0 xfs: global error sign conversion
Convert all the errors the core XFs code to negative error signs
like the rest of the kernel and remove all the sign conversion we
do in the interface layers.

Errors for conversion (and comparison) found via searches like:

$ git grep " E" fs/xfs
$ git grep "return E" fs/xfs
$ git grep " E[A-Z].*;$" fs/xfs

Negation points found via searches like:

$ git grep "= -[a-z,A-Z]" fs/xfs
$ git grep "return -[a-z,A-D,F-Z]" fs/xfs
$ git grep " -[a-z].*;" fs/xfs

[ with some bits I missed from Brian Foster ]

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-25 14:58:08 +10:00
Eric Sandeen
b474c7ae43 xfs: Nuke XFS_ERROR macro
XFS_ERROR was designed long ago to trap return values, but it's not
runtime configurable, it's not consistently used, and we can do
similar error trapping with ftrace scripts and triggers from
userspace.

Just nuke XFS_ERROR and associated bits.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-06-22 15:04:54 +10:00
Brian Foster
57bd3dbe40 xfs: refactor xfs_ialloc_btree.c to support multiple inobt numbers
The introduction of the free inode btree (finobt) requires that
xfs_ialloc_btree.c handle multiple trees. Refactor xfs_ialloc_btree.c
so the caller specifies the btree type on cursor initialization to
prepare for addition of the finobt.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-04-24 16:00:50 +10:00
Jie Liu
a2ba07b2d2 xfs: use xfs_icluster_size_fsb in xfs_bulkstat
Use xfs_icluster_size_fsb() in xfs_bulkstat(), make the related
variables more meaningful and remove an unused variable nimask
from it.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2013-12-13 15:51:48 +11:00
Jie Liu
0f49efd805 xfs: get rid of XFS_INODE_CLUSTER_SIZE macros
Get rid of XFS_INODE_CLUSTER_SIZE() macros, use mp->m_inode_cluster_size
directly.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2013-12-13 15:51:48 +11:00
Dave Chinner
a4fbe6ab1e xfs: decouple inode and bmap btree header files
Currently the xfs_inode.h header has a dependency on the definition
of the BMAP btree records as the inode fork includes an array of
xfs_bmbt_rec_host_t objects in it's definition.

Move all the btree format definitions from xfs_btree.h,
xfs_bmap_btree.h, xfs_alloc_btree.h and xfs_ialloc_btree.h to
xfs_format.h to continue the process of centralising the on-disk
format definitions. With this done, the xfs inode definitions are no
longer dependent on btree header files.

The enables a massive culling of unnecessary includes, with close to
200 #include directives removed from the XFS kernel code base.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-10-23 16:28:49 -05:00
Dave Chinner
239880ef64 xfs: decouple log and transaction headers
xfs_trans.h has a dependency on xfs_log.h for a couple of
structures. Most code that does transactions doesn't need to know
anything about the log, but this dependency means that they have to
include xfs_log.h. Decouple the xfs_trans.h and xfs_log.h header
files and clean up the includes to be in dependency order.

In doing this, remove the direct include of xfs_trans_reserve.h from
xfs_trans.h so that we remove the dependency between xfs_trans.h and
xfs_mount.h. Hence the xfs_trans.h include can be moved to the
indicate the actual dependencies other header files have on it.

Note that these are kernel only header files, so this does not
translate to any userspace changes at all.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-10-23 16:17:44 -05:00
Dave Chinner
70a9883c5f xfs: create a shared header file for format-related information
All of the buffer operations structures are needed to be exported
for xfs_db, so move them all to a common location rather than
spreading them all over the place. They are verifying the on-disk
format, so while xfs_format.h might be a good place, it is not part
of the on disk format.

Hence we need to create a new header file that we centralise these
related definitions. Start by moving the bffer operations
structures, and then also move all the other definitions that have
crept into xfs_log_format.h and xfs_format.h as there was no other
shared header file to put them in.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-10-23 14:11:30 -05:00
Dave Chinner
fdd3cceef4 xfs: factor all the kmalloc-or-vmalloc fallback allocations
We have quite a few places now where we do:

	x = kmem_zalloc(large size)
	if (!x)
		x = kmem_zalloc_large(large size)

and do a similar dance when freeing the memory. kmem_free() already
does the correct freeing dance, and kmem_zalloc_large() is only ever
called in these constructs, so just factor it all into
kmem_zalloc_large() and kmem_free().

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-09-10 13:57:03 -05:00
Dave Chinner
a30b036797 xfs: fix some minor sparse warnings
A couple of simple locking annotations and 0 vs NULL warnings.
Nothing that changes any code behaviour, just removes build noise.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-09-09 17:43:05 -05:00
Jie Liu
9cee4c5b7b xfs: clean up unused codes at xfs_bulkstat()
There are some unused codes at xfs_bulkstat():

- Variable bp is defined to point to the on-disk inode cluster
  buffer, but it proved to be of no practical help.

- We process the chunks of good inodes which were fetched by iterating
  btree records from an AG.  When processing inodes from each chunk,
  the code recomputing agbno if run into the first inode of a cluster,
  however, the agbno is not being used thereafter.

This patch tries to clean up those things.

Signed-off-by: Jie Liu <jeff.liu@oracle.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-07-09 15:36:21 -05:00
Chandra Seetharaman
9cad19d2cb xfs: Define a new function xfs_is_quota_inode()
In preparation for combined pquota/gquota support, define
a new function to check if the given inode is a quota inode.

Signed-off-by: Chandra Seetharaman <sekharan@us.ibm.com>
Reviewed-by: Ben Myers <bpm@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-06-28 13:03:49 -05:00
Dave Chinner
cbb2864aa4 xfs: add pluging for bulkstat readahead
I was running some tests on bulkstat on CRC enabled filesystems when
I noticed that all the IO being issued was 8k in size, regardless of
the fact taht we are issuing sequential 8k buffers for inodes
clusters. The IO size should be 16k for 256 byte inodes, and 32k for
512 byte inodes, but this wasn't happening.

blktrace showed that there was an explict plug and unplug happening
around each readahead IO from _xfs_buf_ioapply, and the unplug was
causing the IO to be issued immediately. Hence no opportunity was
being given to the elevator to merge adjacent readahead requests and
dispatch them as a single IO.

Add plugging around the inode chunk readahead dispatch loop in
bulkstat to ensure that we don't unplug the queue between adjacent
inode buffer readahead IOs and so we get fewer, larger IO requests
hitting the storage subsystem for bulkstat.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2013-06-27 13:26:23 -05:00
Dave Chinner
1813dd6405 xfs: convert buffer verifiers to an ops structure.
To separate the verifiers from iodone functions and associate read
and write verifiers at the same time, introduce a buffer verifier
operations structure to the xfs_buf.

This avoids the need for assigning the write verifier, clearing the
iodone function and re-running ioend processing in the read
verifier, and gets rid of the nasty "b_pre_io" name for the write
verifier function pointer. If we ever need to, it will also be
easier to add further content specific callbacks to a buffer with an
ops structure in place.

We also avoid needing to export verifier functions, instead we
can simply export the ops structures for those that are needed
outside the function they are defined in.

This patch also fixes a directory block readahead verifier issue
it exposed.

This patch also adds ops callbacks to the inode/alloc btree blocks
initialised by growfs. These will need more work before they will
work with CRCs.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Phil White <pwhite@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-11-15 21:35:12 -06:00
Dave Chinner
612cfbfe17 xfs: add pre-write metadata buffer verifier callbacks
These verifiers are essentially the same code as the read verifiers,
but do not require ioend processing. Hence factor the read verifier
functions and add a new write verifier wrapper that is used as the
callback.

This is done as one large patch for all verifiers rather than one
patch per verifier as the change is largely mechanical. This
includes hooking up the write verifier via the read verifier
function.

Hooking up the write verifier for buffers obtained via
xfs_trans_get_buf() will be done in a separate patch as that touches
code in many different places rather than just the verifier
functions.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-11-15 21:35:02 -06:00
Dave Chinner
3d3e6f64e2 xfs: verify btree blocks as they are read from disk
Add an btree block verify callback function and pass it into the
buffer read functions. Because each different btree block type
requires different verification, add a function to the ops structure
that is called from the generic code.

Also, propagate the verification callback functions through the
readahead functions, and into the external bmap and bulkstat inode
readahead code that uses the generic btree buffer read functions.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Phil White <pwhite@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-11-15 21:34:31 -06:00
Dave Chinner
33479e0542 xfs: remove xfs_iget.c
The inode cache functions remaining in xfs_iget.c can be moved to xfs_icache.c
along with the other inode cache functions. This removes all functionality from
xfs_iget.c, so the file can simply be removed.

This move results in various functions now only having the scope of a single
file (e.g. xfs_inode_free()), so clean up all the definitions and exported
prototypes in xfs_icache.[ch] and xfs_inode.h appropriately.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-10-17 13:42:25 -05:00
Christoph Hellwig
475ee413f3 xfs: merge xfs_itobp into xfs_imap_to_bp
All callers of xfs_imap_to_bp want the dinode pointer, so let's calculate it
inside xfs_imap_to_bp.  Once that is done xfs_itobp becomes a fairly pointless
wrapper which can be replaced with direct calls to xfs_imap_to_bp.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-07-22 10:46:56 -05:00
Dave Chinner
ad1e95c54e xfs: clean up xfs_bit.h includes
With the removal of xfs_rw.h and other changes over time, xfs_bit.h
is being included in many files that don't actually need it. Clean
up the includes as necessary.

Also move the only-used-once xfs_ialloc_find_free() static inline
function out of a header file that is widely included to reduce
the number of needless dependencies on xfs_bit.h.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-05-14 16:21:00 -05:00
Dave Chinner
5132ba8f2b xfs: don't cache inodes read through bulkstat
When we read inodes via bulkstat, we generally only read them once
and then throw them away - they never get used again. If we retain
them in cache, then it simply causes the working set of inodes and
other cached items to be reclaimed just so the inode cache can grow.

Avoid this problem by marking inodes read by bulkstat not to be
cached and check this flag in .drop_inode to determine whether the
inode should be added to the VFS LRU or not. If the inode lookup
hits an already cached inode, then don't set the flag. If the inode
lookup hits an inode marked with no cache flag, remove the flag and
allow it to be cached once the current reference goes away.

Inodes marked as not cached will get cleaned up by the background
inode reclaim or via memory pressure, so they will still generate
some short term cache pressure. They will, however, be reclaimed
much sooner and in preference to cache hot inodes.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-03-26 17:19:08 -05:00
Christoph Hellwig
8a9c9980f2 xfs: log timestamp updates
Timestamps on regular files are the last metadata that XFS does not update
transactionally.  Now that we use the delaylog mode exclusively and made
the log scode scale extremly well there is no need to bypass that code for
timestamp updates.  Logging all updates allows to drop a lot of code, and
will allow for further performance improvements later on.

Note that this patch drops optimized handling of fdatasync - it will be
added back in a separate commit.

Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Ben Myers <bpm@sgi.com>
2012-03-13 17:01:15 -05:00
Christoph Hellwig
ecb697c16c xfs: fix variable set but not used warnings
GCC 4.6 now warnings about variables set but not used.  Fix the trivially
fixable warnings of this sort.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
2011-04-08 08:09:12 -05:00
Arkadiusz Mi?kiewicz
6743099ce5 xfs: Extend project quotas to support 32bit project ids
This patch adds support for 32bit project quota identifiers.

On disk format is backward compatible with 16bit projid numbers. projid
on disk is now kept in two 16bit values - di_projid_lo (which holds the
same position as old 16bit projid value) and new di_projid_hi (takes
existing padding) and converts from/to 32bit value on the fly.

xfs_admin (for existing fs), mkfs.xfs (for new fs) needs to be used
to enable PROJID32BIT support.

Signed-off-by: Arkadiusz Miśkiewicz <arekm@maven.pl>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
2010-10-18 15:08:08 -05:00
Christoph Hellwig
f2d6761433 xfs: remove xfs_iput
xfs_iput is just a small wrapper for xfs_iunlock + IRELE.  Having this
out of line wrapper means the trace events in those two can't track
their caller properly.  So just remove the wrapper and opencode the
unlock + rele in the few callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2010-07-26 13:16:44 -05:00
Christoph Hellwig
3400777ff0 xfs: remove unneeded #include statements
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2010-07-26 13:16:33 -05:00
Christoph Hellwig
288699feca xfs: drop dmapi hooks
Dmapi support was never merged upstream, but we still have a lot of hooks
bloating XFS for it, all over the fast pathes of the filesystem.

This patch drops over 700 lines of dmapi overhead.  If we'll ever get HSM
support in mainline at least the namespace events can be done much saner
in the VFS instead of the individual filesystem, so it's not like this
is much help for future work.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2010-07-26 13:16:33 -05:00
Dave Chinner
7b6259e7a8 xfs: remove block number from inode lookup code
The block number comes from bulkstat based inode lookups to shortcut
the mapping calculations. We ar enot able to trust anything from
bulkstat, so drop the block number as well so that the correct
lookups and mappings are always done.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-06-24 11:35:17 +10:00
Dave Chinner
1920779e67 xfs: rename XFS_IGET_BULKSTAT to XFS_IGET_UNTRUSTED
Inode numbers may come from somewhere external to the filesystem
(e.g. file handles, bulkstat information) and so are inherently
untrusted. Rename the flag we use for these lookups to make it
obvious we are doing a lookup of an untrusted inode number and need
to verify it completely before trying to read it from disk.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2010-06-24 11:15:47 +10:00
Christoph Hellwig
7dce11dbac xfs: always use iget in bulkstat
The non-coherent bulkstat versionsthat look directly at the inode
buffers causes various problems with performance optimizations that
make increased use of just logging inodes.  This patch makes bulkstat
always use iget, which should be fast enough for normal use with the
radix-tree based inode cache introduced a while ago.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
2010-06-23 18:11:11 +10:00
Dave Chinner
07000ee686 xfs: return inode fork offset in bulkstat for fsr
So that fsr can attempt to get the fork offset of the temporary
inode it uses the same as the inode it is defragmenting, pass the
fork offset out in the bulkstat information.

The bulkstat structure has padding that has always been zeroed, so
userspace can tell if this field is set or not by use of the xattr
present flag and a non-zero value for the fork offset.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
2010-03-05 11:02:07 -06:00
Christoph Hellwig
bdfb04301f xfs: replace KM_LARGE with explicit vmalloc use
We use the KM_LARGE flag to make kmem_alloc and friends use vmalloc
if necessary.  As we only need this for a few boot/mount time
allocations just switch to explicit vmalloc calls there.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
2010-01-21 13:44:56 -06:00
Dave Chinner
1c1c6ebcf5 xfs: Replace per-ag array with a radix tree
The use of an array for the per-ag structures requires reallocation
of the array when growing the filesystem. This requires locking
access to the array to avoid use after free situations, and the
locking is difficult to get right. To avoid needing to reallocate an
array, change the per-ag structures to an allocated object per ag
and index them using a tree structure.

The AGs are always densely indexed (hence the use of an array), but
the number supported is 2^32 and lookups tend to be random and hence
indexing needs to scale. A simple choice is a radix tree - it works
well with this sort of index.  This change also removes another
large contiguous allocation from the mount/growfs path in XFS.

The growing process now needs to change to only initialise the new
AGs required for the extra space, and as such only needs to
exclusively lock the tree for inserts. The rest of the code only
needs to lock the tree while doing lookups, and hence this will
remove all the deadlocks that currently occur on the m_perag_lock as
it is now an innermost lock. The lock is also changed to a spinlock
from a read/write lock as the hold time is now extremely short.

To complete the picture, the per-ag structures will need to be
reference counted to ensure that we don't free/modify them while
they are still in use.  This will be done in subsequent patch.

Signed-off-by: Dave Chinner <david@fromorbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Alex Elder <aelder@sgi.com>
2010-01-15 15:33:52 -06:00
Christoph Hellwig
f9581b1443 xfs: implement ->dirty_inode to fix timestamp handling
This is picking up on Felix's repost of Dave's patch to implement a
.dirty_inode method.  We really need this notification because
the VFS keeps writing directly into the inode structure instead
of going through methods to update this state.  In addition to
the long-known atime issue we now also have a caller in VM code
that updates c/mtime that way for shared writeable mmaps.  And
I found another one that no one has noticed in practice in the FIFO
code.

So implement ->dirty_inode to set i_update_core whenever the
inode gets externally dirtied, and switch the c/mtime handling to
the same scheme we already use for atime (always picking up
the value from the Linux inode).

Note that this patch also removes the xfs_synchronize_atime call
in xfs_reclaim it was superflous as we already synchronize the time
when writing the inode via the log (xfs_inode_item_format) or the
normal buffers (xfs_iflush_int).

In addition also remove the I_CLEAR check before copying the Linux
timestamps - now that we always have the Linux inode available
we can always use the timestamps in it.

Also switch to just using file_update_time for regular reads/writes -
that will get us all optimization done to it for free and make
sure we notice early when it breaks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Felix Blyakher <felixb@sgi.com>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Alex Elder <aelder@sgi.com>
2009-10-08 12:00:03 -05:00
Christoph Hellwig
2187550525 xfs: rationalize xfs_inobt_lookup*
Currenly we have a xfs_inobt_lookup* variant for each comparism direction,
and all these get all three fields of the inobt records passed, while the
common case is just looking for the inode number and we have only marginally
more callers than xfs_inobt_lookup* variants.

So opencode a direct call to xfs_btree_lookup for the single case where we
need all fields, and replace xfs_inobt_lookup* with a xfs_inobt_looku that
just takes the inode number and the direction for all other callers.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-09-01 12:45:39 -05:00
Christoph Hellwig
2e287a731e xfs: improve xfs_inobt_get_rec prototype
Most callers of xfs_inobt_get_rec need to fill a xfs_inobt_rec_incore_t, and
those who don't yet are fine with a xfs_inobt_rec_incore_t, instead of the
three individual variables, too.  So just change xfs_inobt_get_rec to write
the output into a xfs_inobt_rec_incore_t directly.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Alex Elder <aelder@sgi.com>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-09-01 12:44:56 -05:00
Eric Sandeen
d96f8f891f xfs: add more statics & drop some unused functions
A lot more functions could be made static, but they need
forward declarations; this does some easy ones, and also
found a few unused functions in the process.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Felix Blyakher <felixb@sgi.com>
2009-08-31 14:46:20 -05:00
Malcolm Parsons
9da096fd13 xfs: fix various typos
Signed-off-by: Malcolm Parsons <malcolm.parsons@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
2009-03-29 09:55:42 +02:00
Christoph Hellwig
8fab451e3c xfs: kill vn_atime_* helpers.
Two out of three are unused already, and the third is better done open-coded
with a comment describing what's going on here.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
2009-03-16 08:24:46 +01:00
Eric Sandeen
9d87c3192d [XFS] Remove the rest of the macro-to-function indirections.
Remove the last of the macros-defined-to-static-functions.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2009-01-16 17:10:42 +11:00
sandeen@sandeen.net
65fbaf2489 [XFS] Fix xfs_bulkstat_one size checks & error handling
The 32-bit xfs_blkstat_one handler was failing because
a size check checked whether the remaining (32-bit)
user buffer was less than the (64-bit) bulkstat buffer,
and failed with ENOMEM if so.  Move this check
into the respective handlers so that they check the
correct sizes.

Also, the formatters were returning negative errors
or positive bytes copied; this was odd in the positive
error value world of xfs, and handled wrong by at least
some of the callers, which treated the bytes returned
as an error value.  Move the bytes-used assignment
into the formatters.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-02 17:16:03 +11:00
sandeen@sandeen.net
2ee4fa5cb7 [XFS] Make the bulkstat_one compat ioctl handling more sane
Currently the compat formatter was handled by passing
in "private_data" for the xfs_bulkstat_one formatter,
which was really just another formatter... IMHO this
got confusing.

Instead, just make a new xfs_bulkstat_one_compat
formatter for xfs_bulkstat, and call it via a wrapper.

Also, don't translate the ioctl nrs into their native
counterparts, that just clouds the issue; we're in a
compat handler anyway, just switch on the 32-bit cmds.

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-12-02 17:15:36 +11:00
Christoph Hellwig
b48d8d6437 [XFS] kill the XFS_IMAP_BULKSTAT flag
Just pass down the XFS_IGET_* flags all the way down to xfs_imap instead
of translating them mid-way.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:38:13 +11:00
Christoph Hellwig
92bfc6e7c4 [XFS] embededd struct xfs_imap into xfs_inode
Most uses of struct xfs_imap are to map and inode to a buffer.  To avoid
copying around the inode location information we should just embedd a
strcut xfs_imap into the xfs_inode.  To make sure it doesn't bloat an
inode the im_len is changed to a ushort, which is fine as that's what
the users exepect anyway.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:38:08 +11:00
Christoph Hellwig
51ce16d519 [XFS] kill XFS_DINODE_VERSION_ defines
These names don't add any value at all over just using the numerical
values.

(First sent on October 9th)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:37:42 +11:00
Christoph Hellwig
81591fe2db [XFS] kill xfs_dinode_core_t
Now that we have a separate xfs_icdinode_t for the in-core inode which
gets logged there is no need anymore for the xfs_dinode vs xfs_dinode_core
split - the fact that part of the structure gets logged through the inode
log item and a small part not can better be described in a comment.

All sizeof operations on the dinode_core either really wanted the
icdinode and are switched to that one, or had already added the size
of the agi unlinked list pointer.  Later both will be replaced with
helpers once we get the larger CRC-enabled dinode.

Removing the data and attribute fork unions also has the advantage that
xfs_dinode.h doesn't need to pull in every header under the sun.

While we're at it also add some more comments describing the dinode
structure.

(First sent on October 7th)

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
2008-12-01 11:37:35 +11:00
Christoph Hellwig
c679eef052 [XFS] stop using xfs_itobp in xfs_bulkstat
xfs_bulkstat only wants the dinode, offset and buffer from a given inode
number. Instead of using xfs_itobp on a fake inode which is complicated
and currently leads to leaks of the security data just use xfs_inotobp
which is designed to do exactly the kind of lookup xfs_bulkstat wants. The
only thing that's missing in xfs_inotobp is a flags paramter that let's us
pass down XFS_IMAP_BULKSTAT, but that can easily added.

SGI-PV: 987246

SGI-Modid: xfs-linux-melb:xfs-kern:32397a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 18:04:13 +11:00
Christoph Hellwig
637aa50f46 [XFS] implement generic xfs_btree_increment
From: Dave Chinner <dgc@sgi.com>

Because this is the first major generic btree routine this patch includes
some infrastrucure, first a few routines to deal with a btree block that
can be either in short or long form, second xfs_btree_read_buf_block,
which is the new central routine to read a btree block given a cursor, and
third the new xfs_btree_ptr_addr routine to calculate the address for a
given btree pointer record.

[hch: split out from bigger patch and minor adaptions]

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32190a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:55:45 +11:00
Christoph Hellwig
561f7d1739 [XFS] split up xfs_btree_init_cursor
xfs_btree_init_cursor contains close to little shared code for the
different btrees and will get even more non-common code in the future.
Split it up into one routine per btree type.

Because xfs_btree_dup_cursor needs to call the init routine for a generic
btree cursor add a new btree operation vector that contains a dup_cursor
method that initializes a new cursor based on an existing one.

The btree operations vector is based on an idea and code from Dave Chinner
and will grow more entries later during this series.

SGI-PV: 985583

SGI-Modid: xfs-linux-melb:xfs-kern:32176a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Bill O'Donnell <billodo@sgi.com>
Signed-off-by: David Chinner <david@fromorbit.com>
2008-10-30 16:53:59 +11:00
David Chinner
07c8f67587 [XFS] Make use of the init-once slab optimisation.
To avoid having to initialise some fields of the XFS inode on every
allocation, we can use the slab init-once feature to initialise them. All
we have to guarantee is that when we free the inode, all it's entries are
in the initial state. Add asserts where possible to ensure debug kernels
check this initial state before freeing and after allocation.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31925a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
2008-10-30 16:11:59 +11:00
Christoph Hellwig
df80c933f9 [XFS] remove some easy bhv_vnode_t instances
In various places we can just move a VFS_I call into the argument list of
called functions/macros instead of having a local bhv_vnode_t.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31776a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:22:09 +10:00
David Chinner
e4f7529108 [XFS] Kill shouty XFS_ITOV() macro
Replace XFS_ITOV() with the new VFS_I() inline.

SGI-PV: 981498

SGI-Modid: xfs-linux-melb:xfs-kern:31724a

Signed-off-by: David Chinner <david@fromorbit.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-08-13 16:00:45 +10:00
Denys Vlasenko
f0e2d93c29 [XFS] Remove unused arg from kmem_free()
kmem_free() function takes (ptr, size) arguments but doesn't actually use
second one.

This patch removes size argument from all callsites.

SGI-PV: 981498
SGI-Modid: xfs-linux-melb:xfs-kern:31050a

Signed-off-by: Denys Vlasenko <vda.linux@googlemail.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-07-28 16:58:07 +10:00
Christoph Hellwig
6a7f422d47 [XFS] kill di_mode checks after xfs_iget
Unless XFS_IGET_CREATE is passed xfs_iget will return ENOENT if it
encounters an inode with di_mode == 0. Remove the duplicated checks in the
callers.

(the log recovery case is not touched for now)

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30898a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-29 15:53:31 +10:00
David Chinner
7b07339048 [XFS] xfs_bulkstat_one_dinode() never returns an error.
Mark it void.

SGI-PV: 980084
SGI-Modid: xfs-linux-melb:xfs-kern:30828a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Niv Sardi <xaiki@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 12:01:27 +10:00
David Chinner
a3f74ffb6d [XFS] Don't block pdflush when writing back inodes
When pdflush is writing back inodes, it can get stuck on inode cluster
buffers that are currently under I/O. This occurs when we write data to
multiple inodes in the same inode cluster at the same time.

Effectively, delayed allocation marks the inode dirty during the data
writeback. Hence if the inode cluster was flushed during the writeback of
the first inode, the writeback of the second inode will block waiting for
the inode cluster write to complete before writing it again for the newly
dirtied inode.

Basically, we want to avoid this from happening so we don't block pdflush
and slow down all of writeback. Hence we introduce a non-blocking async
inode flush flag that pdflush uses. If this flag is set, we use
non-blocking operations (e.g. try locks) whereever we can to avoid
blocking or extra I/O being issued.

SGI-PV: 970925
SGI-Modid: xfs-linux-melb:xfs-kern:30501a

Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-18 11:37:32 +10:00
Eric Sandeen
6211870992 [XFS] remove shouting-indirection macros from xfs_sb.h
Remove macro-to-small-function indirection from xfs_sb.h, and remove some
which are completely unused.

SGI-PV: 976035
SGI-Modid: xfs-linux-melb:xfs-kern:30528a

Signed-off-by: Eric Sandeen <sandeen@sandeen.net>
Signed-off-by: Donald Douwsma <donaldd@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-04-10 16:24:45 +10:00
Christoph Hellwig
45ba598e56 [XFS] Remove CFORK macros and use code directly in IFORK and DFORK macros.
Currently XFS_IFORK_* and XFS_DFORK* are implemented by means of
XFS_CFORK* macros. But given that XFS_IFORK_* operates on an xfs_inode
that embedds and xfs_icdinode_core and XFS_DFORK_* operates on an
xfs_dinode that embedds a xfs_dinode_core one will have to do endian
swapping while the other doesn't. Instead of having the current mess with
the CFORK macros that have byteswapping and non-byteswapping version
(which are inconsistantly named while we're at it) just define each family
of the macros to stand by itself and simplify the whole matter.

A few direct references to the CFORK variants were cleaned up to use IFORK
or DFORK to make this possible.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30163a

Signed-off-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:19:24 +11:00
Tim Shimmin
e6a4b37f38 [XFS] Remove the BPCSHIFT and NB* based macros from XFS.
The BPCSHIFT based macros, btoc*, ctob*, offtoc* and ctooff are either not
used or don't need to be used. The NDPP, NDPP, NBBY macros don't need to
be used but instead are replaced directly by PAGE_SIZE and PAGE_CACHE_SIZE
where appropriate. Initial patch and motivation from Nicolas Kaiser.

SGI-PV: 971186
SGI-Modid: xfs-linux-melb:xfs-kern:30096a

Signed-off-by: Tim Shimmin <tes@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:17:58 +11:00
Vlad Apostolov
c319b58b13 [XFS] Make xfs_bulkstat() to report unlinked but referenced inodes
We need xfs_bulkstat() to report inode stat for inodes with link count
zero but reference count non zero.

The fix here:

http://oss.sgi.com/archives/xfs/2007-09/msg00266.html

changed this behavior and made xfs_bulkstat() to filter all unlinked
inodes including those that are not destroyed yet but held by reference.

The attached patch returns back to the original behavior by marking the
on-disk inode buffer "dirty" when di_mode is cleared (at that time both
inode link and reference counter are zero).

SGI-PV: 972004
SGI-Modid: xfs-linux-melb:xfs-kern:29914a

Signed-off-by: Vlad Apostolov <vapo@sgi.com>
Signed-off-by: David Chinner <dgc@sgi.com>
Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
2008-02-07 18:13:37 +11:00