Commit Graph

59 Commits

Author SHA1 Message Date
Steven Rostedt (Google)
d508ee2dd5 tracefs/eventfs: Free top level files on removal
When an instance is removed, the top level files of the eventfs directory
are not cleaned up. Call the eventfs_remove() on each of the entries to
free them.

This was found via kmemleak:

unreferenced object 0xffff8881047c1280 (size 96):
  comm "mkdir", pid 924, jiffies 4294906489 (age 2013.077s)
  hex dump (first 32 bytes):
    18 31 ed 03 81 88 ff ff 00 31 09 24 81 88 ff ff  .1.......1.$....
    00 00 00 00 00 00 00 00 98 19 7c 04 81 88 ff ff  ..........|.....
  backtrace:
    [<000000000fa46b4d>] kmalloc_trace+0x2a/0xa0
    [<00000000e729cd0c>] eventfs_prepare_ef.constprop.0+0x3a/0x160
    [<000000009032e6a8>] eventfs_add_events_file+0xa0/0x160
    [<00000000fe968442>] create_event_toplevel_files+0x6f/0x130
    [<00000000e364d173>] event_trace_add_tracer+0x14/0x140
    [<00000000411840fa>] trace_array_create_dir+0x52/0xf0
    [<00000000967804fa>] trace_array_create+0x208/0x370
    [<00000000da505565>] instance_mkdir+0x6b/0xb0
    [<00000000dc1215af>] tracefs_syscall_mkdir+0x5b/0x90
    [<00000000a8aca289>] vfs_mkdir+0x272/0x380
    [<000000007709b242>] do_mkdirat+0xfc/0x1d0
    [<00000000c0b6d219>] __x64_sys_mkdir+0x78/0xa0
    [<0000000097b5dd4b>] do_syscall_64+0x3f/0x90
    [<00000000a3f00cfa>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8
unreferenced object 0xffff888103ed3118 (size 8):
  comm "mkdir", pid 924, jiffies 4294906489 (age 2013.077s)
  hex dump (first 8 bytes):
    65 6e 61 62 6c 65 00 00                          enable..
  backtrace:
    [<0000000010f75127>] __kmalloc_node_track_caller+0x51/0x160
    [<000000004b3eca91>] kstrdup+0x34/0x60
    [<0000000050074d7a>] eventfs_prepare_ef.constprop.0+0x53/0x160
    [<000000009032e6a8>] eventfs_add_events_file+0xa0/0x160
    [<00000000fe968442>] create_event_toplevel_files+0x6f/0x130
    [<00000000e364d173>] event_trace_add_tracer+0x14/0x140
    [<00000000411840fa>] trace_array_create_dir+0x52/0xf0
    [<00000000967804fa>] trace_array_create+0x208/0x370
    [<00000000da505565>] instance_mkdir+0x6b/0xb0
    [<00000000dc1215af>] tracefs_syscall_mkdir+0x5b/0x90
    [<00000000a8aca289>] vfs_mkdir+0x272/0x380
    [<000000007709b242>] do_mkdirat+0xfc/0x1d0
    [<00000000c0b6d219>] __x64_sys_mkdir+0x78/0xa0
    [<0000000097b5dd4b>] do_syscall_64+0x3f/0x90
    [<00000000a3f00cfa>] entry_SYSCALL_64_after_hwframe+0x6e/0xd8

Link: https://lore.kernel.org/linux-trace-kernel/20230907175859.6fedbaa2@gandalf.local.home

Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Ajay Kaher <akaher@vmware.com>
Cc: Zheng Yejian <zhengyejian1@huawei.com>
Cc: Naresh Kamboju <naresh.kamboju@linaro.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Fixes: 5bdcd5f533 eventfs: ("Implement removal of meta data from eventfs")
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-08 23:12:06 -04:00
Steven Rostedt (Google)
9879e5e1c5 tracefs/eventfs: Use dput to free the toplevel events directory
Currently when rmdir on an instance is done, eventfs_remove_events_dir()
is called and it does a dput on the dentry and then frees the
eventfs_inode that represents the events directory.

But there's no protection against a reader reading the top level events
directory at the same time and we can get a use after free error. Instead,
use the dput() associated to the dentry to also free the eventfs_inode
associated to the events directory, as that will get called when the last
reference to the directory is released.

This issue triggered the following KASAN report:

 ==================================================================
 BUG: KASAN: slab-use-after-free in eventfs_root_lookup+0x88/0x1b0
 Read of size 8 at addr ffff888120130ca0 by task ftracetest/1201

 CPU: 4 PID: 1201 Comm: ftracetest Not tainted 6.5.0-test-10737-g469e0a8194e7 #13
 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
 Call Trace:
  <TASK>
  dump_stack_lvl+0x57/0x90
  print_report+0xcf/0x670
  ? __pfx_ring_buffer_record_off+0x10/0x10
  ? _raw_spin_lock_irqsave+0x2b/0x70
  ? __virt_addr_valid+0xd9/0x160
  kasan_report+0xd4/0x110
  ? eventfs_root_lookup+0x88/0x1b0
  ? eventfs_root_lookup+0x88/0x1b0
  eventfs_root_lookup+0x88/0x1b0
  ? eventfs_root_lookup+0x33/0x1b0
  __lookup_slow+0x194/0x2a0
  ? __pfx___lookup_slow+0x10/0x10
  ? down_read+0x11c/0x330
  walk_component+0x166/0x220
  link_path_walk.part.0.constprop.0+0x3a3/0x5a0
  ? seqcount_lockdep_reader_access+0x82/0x90
  ? __pfx_link_path_walk.part.0.constprop.0+0x10/0x10
  path_openat+0x143/0x11f0
  ? __lock_acquire+0xa1a/0x3220
  ? __pfx_path_openat+0x10/0x10
  ? __pfx___lock_acquire+0x10/0x10
  do_filp_open+0x166/0x290
  ? __pfx_do_filp_open+0x10/0x10
  ? lock_is_held_type+0xce/0x120
  ? preempt_count_sub+0xb7/0x100
  ? _raw_spin_unlock+0x29/0x50
  ? alloc_fd+0x1a0/0x320
  do_sys_openat2+0x126/0x160
  ? rcu_is_watching+0x34/0x60
  ? __pfx_do_sys_openat2+0x10/0x10
  ? __might_resched+0x2cf/0x3b0
  ? __fget_light+0xdf/0x100
  __x64_sys_openat+0xcd/0x140
  ? __pfx___x64_sys_openat+0x10/0x10
  ? syscall_enter_from_user_mode+0x22/0x90
  ? lockdep_hardirqs_on+0x7d/0x100
  do_syscall_64+0x3b/0xc0
  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
 RIP: 0033:0x7f1dceef5e51
 Code: 75 57 89 f0 25 00 00 41 00 3d 00 00 41 00 74 49 80 3d 9a 27 0e 00 00 74 6d 89 da 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 93 00 00 00 48 8b 54 24 28 64 48 2b 14 25
 RSP: 002b:00007fff2cddf380 EFLAGS: 00000202 ORIG_RAX: 0000000000000101
 RAX: ffffffffffffffda RBX: 0000000000000241 RCX: 00007f1dceef5e51
 RDX: 0000000000000241 RSI: 000055d7520677d0 RDI: 00000000ffffff9c
 RBP: 000055d7520677d0 R08: 000000000000001e R09: 0000000000000001
 R10: 00000000000001b6 R11: 0000000000000202 R12: 0000000000000000
 R13: 0000000000000003 R14: 000055d752035678 R15: 000055d752067788
  </TASK>

 Allocated by task 1200:
  kasan_save_stack+0x2f/0x50
  kasan_set_track+0x21/0x30
  __kasan_kmalloc+0x8b/0x90
  eventfs_create_events_dir+0x54/0x220
  create_event_toplevel_files+0x42/0x130
  event_trace_add_tracer+0x33/0x180
  trace_array_create_dir+0x52/0xf0
  trace_array_create+0x361/0x410
  instance_mkdir+0x6b/0xb0
  tracefs_syscall_mkdir+0x57/0x80
  vfs_mkdir+0x275/0x380
  do_mkdirat+0x1da/0x210
  __x64_sys_mkdir+0x74/0xa0
  do_syscall_64+0x3b/0xc0
  entry_SYSCALL_64_after_hwframe+0x6e/0xd8

 Freed by task 1251:
  kasan_save_stack+0x2f/0x50
  kasan_set_track+0x21/0x30
  kasan_save_free_info+0x27/0x40
  __kasan_slab_free+0x106/0x180
  __kmem_cache_free+0x149/0x2e0
  event_trace_del_tracer+0xcb/0x120
  __remove_instance+0x16a/0x340
  instance_rmdir+0x77/0xa0
  tracefs_syscall_rmdir+0x77/0xc0
  vfs_rmdir+0xed/0x2d0
  do_rmdir+0x235/0x280
  __x64_sys_rmdir+0x5f/0x90
  do_syscall_64+0x3b/0xc0
  entry_SYSCALL_64_after_hwframe+0x6e/0xd8

 The buggy address belongs to the object at ffff888120130ca0
  which belongs to the cache kmalloc-16 of size 16
 The buggy address is located 0 bytes inside of
  freed 16-byte region [ffff888120130ca0, ffff888120130cb0)

 The buggy address belongs to the physical page:
 page:000000004dbddbb0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x120130
 flags: 0x17ffffc0000800(slab|node=0|zone=2|lastcpupid=0x1fffff)
 page_type: 0xffffffff()
 raw: 0017ffffc0000800 ffff8881000423c0 dead000000000122 0000000000000000
 raw: 0000000000000000 0000000000800080 00000001ffffffff 0000000000000000
 page dumped because: kasan: bad access detected

 Memory state around the buggy address:
  ffff888120130b80: 00 00 fc fc 00 05 fc fc 00 00 fc fc 00 02 fc fc
  ffff888120130c00: 00 07 fc fc 00 00 fc fc 00 00 fc fc fa fb fc fc
 >ffff888120130c80: 00 00 fc fc fa fb fc fc 00 00 fc fc 00 00 fc fc
                                ^
  ffff888120130d00: 00 00 fc fc 00 00 fc fc 00 00 fc fc fa fb fc fc
  ffff888120130d80: 00 00 fc fc 00 00 fc fc 00 00 fc fc 00 00 fc fc
 ==================================================================

Link: https://lkml.kernel.org/r/20230907024803.250873643@goodmis.org
Link: https://lore.kernel.org/all/1cb3aee2-19af-c472-e265-05176fe9bd84@huawei.com/

Cc: Ajay Kaher <akaher@vmware.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Fixes: 5bdcd5f533 eventfs: ("Implement removal of meta data from eventfs")
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Reported-by: Zheng Yejian <zhengyejian1@huawei.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-07 16:04:47 -04:00
Steven Rostedt (Google)
e24709454c tracefs/eventfs: Add missing lockdown checks
All the eventfs external functions do not check if TRACEFS_LOCKDOWN was
set or not. This can caused some functions to return success while others
fail, which can trigger unexpected errors.

Add the missing lockdown checks.

Link: https://lkml.kernel.org/r/20230905182711.899724045@goodmis.org
Link: https://lore.kernel.org/all/202309050916.58201dc6-oliver.sang@intel.com/

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Ajay Kaher <akaher@vmware.com>
Cc: Ching-lin Yu <chinglinyu@google.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-09-05 21:14:08 -04:00
Steven Rostedt (Google)
8c96b70171 tracefs: Remove kerneldoc from struct eventfs_file
The struct eventfs_file is a local structure and should not be parsed by
kernel doc. It also does not fully follow the kerneldoc format and is
causing kerneldoc to spit out errors. Replace the /** to /* so that
kerneldoc no longer processes this structure.

Also format the comments of the delete union of the structure to be a bit
better.

Link: https://lore.kernel.org/linux-trace-kernel/20230818201414.2729745-1-willy@infradead.org/
Link: https://lore.kernel.org/linux-trace-kernel/20230822053313.77aa3397@rorschach.local.home

Cc: Mark Rutland <mark.rutland@arm.com>
Reported-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-08-22 09:05:24 -04:00
Ajay Kaher
5bdcd5f533 eventfs: Implement removal of meta data from eventfs
When events are removed from tracefs, the eventfs must be aware of this.
The eventfs_remove() removes the meta data from eventfs so that it will no
longer create the files associated with that event.

When an instance is removed from tracefs, eventfs_remove_events_dir() will
remove and clean up the entire "events" directory.

The helper function eventfs_remove_rec() is used to clean up and free the
associated data from eventfs for both of the added functions. SRCU is used
to protect the lists of meta data stored in the eventfs. The eventfs_mutex
is used to protect the content of the items in the list.

As lookups may be happening as deletions of events are made, the freeing
of dentry/inodes and relative information is done after the SRCU grace
period has passed.

Link: https://lkml.kernel.org/r/1690568452-46553-9-git-send-email-akaher@vmware.com

Signed-off-by: Ajay Kaher <akaher@vmware.com>
Co-developed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Ching-lin Yu <chinglinyu@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202305030611.Kas747Ev-lkp@intel.com/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-30 18:13:34 -04:00
Ajay Kaher
a376007917 eventfs: Implement functions to create files and dirs when accessed
Add create_file() and create_dir() functions to create the files and
directories respectively when they are accessed. The functions will be
called from the lookup operation of the inode_operations or from the open
function of file_operations.

Link: https://lkml.kernel.org/r/1690568452-46553-8-git-send-email-akaher@vmware.com

Signed-off-by: Ajay Kaher <akaher@vmware.com>
Co-developed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Ching-lin Yu <chinglinyu@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-30 18:13:34 -04:00
Ajay Kaher
6394044955 eventfs: Implement eventfs lookup, read, open functions
Add the inode_operations, file_operations, and helper functions to eventfs:
dcache_dir_open_wrapper()
eventfs_root_lookup()
eventfs_release()
eventfs_set_ef_status_free()
eventfs_post_create_dir()

The inode_operations and file_operations functions will be called from the
VFS layer.

create_file() and create_dir() are added as stub functions and will be
filled in later.

Link: https://lkml.kernel.org/r/1690568452-46553-7-git-send-email-akaher@vmware.com

Signed-off-by: Ajay Kaher <akaher@vmware.com>
Co-developed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Ching-lin Yu <chinglinyu@google.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-30 18:13:34 -04:00
Ajay Kaher
88f349b4a8 eventfs: Implement eventfs file add functions
Add the following functions to add files to evenfs:

eventfs_add_events_file() to add the data needed to create a specific file
located at the top level events directory. The dentry/inode will be
created when the events directory is scanned.

eventfs_add_file() to add the data needed for files within the directories
below the top level events directory. The dentry/inode of the file will be
created when the directory that the file is in is scanned.

Link: https://lkml.kernel.org/r/1690568452-46553-6-git-send-email-akaher@vmware.com

Signed-off-by: Ajay Kaher <akaher@vmware.com>
Co-developed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Ching-lin Yu <chinglinyu@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202305051619.9a469a9a-yujie.liu@intel.com
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-30 18:13:33 -04:00
Ajay Kaher
c1504e5102 eventfs: Implement eventfs dir creation functions
Add eventfs_file structure which will hold the properties of the eventfs
files and directories.

Add following functions to create the directories in eventfs:

eventfs_create_events_dir() will create the top level "events" directory
within the tracefs file system.

eventfs_add_subsystem_dir() creates an eventfs_file descriptor with the
given name of the subsystem.

eventfs_add_dir() creates an eventfs_file descriptor with the given name of
the directory and attached to a eventfs_file of a subsystem.

Add tracefs_inode structure to hold the inodes, flags and pointers to
private data used by eventfs.

Link: https://lkml.kernel.org/r/1690568452-46553-5-git-send-email-akaher@vmware.com

Signed-off-by: Ajay Kaher <akaher@vmware.com>
Co-developed-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Ching-lin Yu <chinglinyu@google.com>
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202305051619.9a469a9a-yujie.liu@intel.com
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2023-07-30 18:13:33 -04:00