#+TITLE: The New Samba VFS
#+AUTHOR: Ralph Böhme, SerNet, Samba Team
#+DATE: {{{modification-time(%Y-%m-%d)}}}
The effort to modernize Samba's VFS interface has reached a major milestone with
the next release, Samba 4.14.

Starting with version 4.14, Samba provides core infrastructure code that allows
basing all access to the server's filesystem on file handles instead of
paths. An example of this is using =fstat()= instead of =stat()=, or, in Samba
parlance, =SMB_VFS_FSTAT()= instead of =SMB_VFS_STAT()=.
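
The difference between the two styles can be illustrated with plain POSIX
calls. The following is a minimal sketch, not Samba code: the function names
=size_by_path()=, =size_by_handle()= and =stat_vs_fstat_example()= are made up
for illustration. It shows that a handle keeps referring to the same object
after the path has been renamed away, while the path based lookup fails.

#+begin_src c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Path based: the kernel resolves "path" again on every call. */
int size_by_path(const char *path, off_t *size)
{
	struct stat st;

	if (stat(path, &st) != 0) {
		return -1;
	}
	*size = st.st_size;
	return 0;
}

/* Handle based: the path was resolved once, when fd was opened. */
int size_by_handle(int fd, off_t *size)
{
	struct stat st;

	if (fstat(fd, &st) != 0) {
		return -1;
	}
	*size = st.st_size;
	return 0;
}

/*
 * Returns 0 if, after renaming the file, the path based lookup on the
 * old name fails while the handle based lookup still succeeds.
 */
int stat_vs_fstat_example(void)
{
	char old_name[] = "/tmp/vfs_stat_XXXXXX";
	char new_name[sizeof(old_name) + 8];
	off_t size = -1;
	int ret = -1;
	int fd = mkstemp(old_name);

	if (fd == -1) {
		return -1;
	}
	snprintf(new_name, sizeof(new_name), "%s.moved", old_name);
	if (write(fd, "hello", 5) != 5 || rename(old_name, new_name) != 0) {
		goto out;
	}
	/* The old path is gone, so the path based lookup must fail. */
	if (size_by_path(old_name, &size) == 0) {
		goto out;
	}
	/* The handle still refers to the renamed file. */
	if (size_by_handle(fd, &size) != 0 || size != 5) {
		goto out;
	}
	ret = 0;
out:
	unlink(old_name);
	unlink(new_name);
	close(fd);
	return ret;
}
#+end_src

Calling =stat_vs_fstat_example()= creates a scratch file below =/tmp= and
removes it again before returning.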

Historically, Samba's fileserver code had to deal extensively with processing
path based SMB requests. While the SMB protocol itself has been streamlined to
be purely handle based starting with SMB2, large parts of the infrastructure
code remain in place that "degrade" handle based SMB2 requests to path based

In order to fully leverage the handle based nature of the SMB2 protocol, we came
up with a straightforward way to convert this infrastructure code.

At the core, we introduced a helper function that opens a file handle that only
serves as a path reference and hence cannot be used for any sort of access to

Samba's internal file handle structure is of type =struct files_struct=, and
variables pointing to objects of this type are typically called =fsp=. Until
very recently the only function that would open such a file handle and return an
fsp was =SMB_VFS_CREATE_FILE()=.

Internally =SMB_VFS_CREATE_FILE()= consisted of calling Samba's VFS open
function to open the low level file and then going through Samba's Windows

The key point of the new helper function, which is called
=openat_pathref_fsp()=, is that it skips the NTFS emulation logic. Additionally,
the handle is restricted internally to be usable only as a path reference, but
not for any sort of IO. On Linux this is achieved by using the =O_PATH= flag to
=open()=; on systems without =O_PATH= support other mechanisms are used,
described in more detail

Path processing in Samba typically means processing client supplied paths with
Samba's core path processing function =filename_convert()=, which returns a
pointer to an object of type =struct smb_filename=. Pointers to such objects are
then passed around, often through many layers of code.

By attaching an =fsp= file handle returned from =openat_pathref_fsp()= to all
=struct smb_filename= objects returned from =filename_convert()=, the whole
infrastructure code has immediate access to a file handle. The large
infrastructure codebase can therefore be converted piecemeal to use handle based
VFS functions wherever VFS access is done.

On Linux the =O_PATH= flag to =open()= can be used to open a file-handle on a
file or directory with interesting properties: [fn:manpage]

- the file-handle indicates a location in the filesystem tree,

- no permission checks are done by the kernel on the filesystem object and

- only operations that act purely at the file descriptor level are allowed.

The file itself is not opened, and other file operations (e.g., ~read(2)~,
~write(2)~, ~fchmod(2)~, ~fchown(2)~, ~fgetxattr(2)~, ~ioctl(2)~, ~mmap(2)~) fail
with the error ~EBADF~.

The following subset of operations that is relevant to Samba is allowed:

- ~fchdir(2)~, if the file descriptor refers to a directory,

- passing the file descriptor as the dirfd argument of ~openat()~ and the other
  "*at()" system calls. This includes ~linkat(2)~ with ~AT_EMPTY_PATH~ (or via
  procfs using ~AT_SYMLINK_FOLLOW~) even if the file is not a directory.
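
The behaviour described above can be observed with a few lines of C on
Linux. This is a minimal sketch under the assumption of a kernel where
~fstat(2)~ is permitted on ~O_PATH~ descriptors; the function name
~o_path_example()~ is made up for illustration.

#+begin_src c
#define _GNU_SOURCE /* needed for O_PATH */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Returns 0 if an O_PATH descriptor allows fstat() but refuses read()
 * with EBADF, -1 otherwise.
 */
int o_path_example(void)
{
	char name[] = "/tmp/vfs_opath_XXXXXX";
	struct stat st;
	char buf[1];
	int ret = -1;
	int pathref_fd = -1;
	int fd = mkstemp(name);

	if (fd == -1) {
		return -1;
	}
	close(fd);

	pathref_fd = open(name, O_PATH);
	if (pathref_fd == -1) {
		goto out;
	}
	/* Metadata access acts purely at the file descriptor level. */
	if (fstat(pathref_fd, &st) != 0) {
		goto out;
	}
	/* IO is refused: the file itself was never opened. */
	errno = 0;
	if (read(pathref_fd, buf, sizeof(buf)) != -1 || errno != EBADF) {
		goto out;
	}
	ret = 0;
out:
	if (pathref_fd != -1) {
		close(pathref_fd);
	}
	unlink(name);
	return ret;
}
#+end_src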

Opening a file or directory with the ~O_PATH~ flag requires no permissions
on the object itself (but does require execute permission on the
directories in the path prefix). By contrast, obtaining a reference to a
filesystem object by opening it with the ~O_RDONLY~ flag requires that the
caller have read permission on the object, even when the subsequent
operation (e.g., ~fchdir(2)~, ~fstat(2)~) does not require read permission
on the object.
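
The permission difference can be sketched as follows. The function name
~open_without_permissions()~ is hypothetical. Because a privileged caller (for
example root) bypasses the read permission check, the sketch only reports the
~errno~ of the ~O_RDONLY~ attempt instead of requiring it to fail.

#+begin_src c
#define _GNU_SOURCE /* needed for O_PATH */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a file with mode 0000 and open it twice. Returns 0 if the
 * O_PATH open succeeded. *rdonly_errno receives 0 if the O_RDONLY open
 * succeeded too (privileged caller), or its errno, typically EACCES.
 */
int open_without_permissions(int *rdonly_errno)
{
	char name[] = "/tmp/vfs_perm_XXXXXX";
	int ret = -1;
	int fd = mkstemp(name);

	if (fd == -1) {
		return -1;
	}
	close(fd);
	if (chmod(name, 0) != 0) {
		goto out;
	}

	/* O_PATH needs no permission on the object itself. */
	fd = open(name, O_PATH);
	if (fd == -1) {
		goto out;
	}
	close(fd);

	/* O_RDONLY needs read permission on the object. */
	fd = open(name, O_RDONLY);
	if (fd == -1) {
		*rdonly_errno = errno;
	} else {
		*rdonly_errno = 0;
		close(fd);
	}
	ret = 0;
out:
	unlink(name);
	return ret;
}
#+end_src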

If, for example, Samba receives an SMB request to open a file requesting
~SEC_FILE_READ_ATTRIBUTE~ access rights because the client wants to read the
file's metadata from the handle, Samba will have to call ~open()~ with at least
~O_RDONLY~ access rights.
*** Use cases for O_PATH in Samba
The ~O_PATH~ flag is currently not used in Samba. By leveraging this Linux
specific flag we can avoid permission mismatches as described above.

Additionally, ~O_PATH~ allows basing all filesystem accesses done by the
fileserver on handle based syscalls by opening all client pathnames with
~O_PATH~ and consistently using, for example, ~fstat()~ instead of ~stat()~
throughout the codebase.

Subsequent parts of this document will call such file-handles opened with
~O_PATH~ *path referencing file-handles* or *pathrefs* for short.
*** When to open with O_PATH
In Samba the decision whether to call POSIX ~open()~ on a client pathname or
whether to leave the low-level handle at -1 (what we call a stat-open) is based
on the client requested SMB access mask.

The set of access rights that trigger an ~open()~ includes
~READ_CONTROL_ACCESS~. As a result, the ~open()~ will be done with at least
~O_RDONLY~. If the filesystem supports NT style ACLs natively (like GPFS or ZFS),
the filesystem may grant the user requested right ~READ_CONTROL_ACCESS~, but it
may not grant ~READ_DATA~ (~O_RDONLY~).

Currently the full set of access rights that trigger opening a file is:

- SEC_FLAG_SYSTEM_SECURITY
- READ_CONTROL_ACCESS

In the future we can remove the following rights from the list on systems that

- SEC_FLAG_SYSTEM_SECURITY
- READ_CONTROL_ACCESS
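
The access mask decision could be sketched like this. Everything in the block
is an assumption made for illustration: the ~EX_*~ constants are stand-ins
whose values follow the usual Windows access mask bits, ~EX_READ_DATA~
represents the rights not shown in the list above, and the ~have_o_path~
switch reflects one reading of the planned change, namely that it applies to
systems with ~O_PATH~ support.

#+begin_src c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-ins, not taken from Samba's headers. */
#define EX_READ_DATA           0x00000001u
#define EX_READ_ATTRIBUTE      0x00000080u
#define EX_READ_CONTROL_ACCESS 0x00020000u
#define EX_SYSTEM_SECURITY     0x01000000u

/* Rights that could be served from a pathref handle in the future. */
#define EX_PATHREF_SUFFICIENT (EX_READ_CONTROL_ACCESS | EX_SYSTEM_SECURITY)

/*
 * Decide whether the requested access mask needs a real open() of the
 * file or whether a stat-open (low-level fd left at -1) is enough.
 */
bool ex_needs_real_open(uint32_t access_mask, bool have_o_path)
{
	uint32_t trigger = EX_READ_DATA |
			   EX_READ_CONTROL_ACCESS |
			   EX_SYSTEM_SECURITY;

	if (have_o_path) {
		/* These rights no longer force an open with O_RDONLY. */
		trigger &= ~EX_PATHREF_SUFFICIENT;
	}
	return (access_mask & trigger) != 0;
}
#+end_src

With this sketch a request for ~EX_READ_CONTROL_ACCESS~ alone triggers a real
open only when ~have_o_path~ is false, while ~EX_READ_ATTRIBUTE~ alone never
does.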
*** Fallback on systems without O_PATH support
The code of higher level file-handle consumers must be kept simple and
streamlined, avoiding special casing of file-handles opened with or without
~O_PATH~. To achieve this, a fallback is needed that allows opening a
file-handle with the same higher level semantics even if the system doesn't
support ~O_PATH~.

On such systems this is implemented by impersonating the root user for the
~open()~ syscall. In order to avoid privilege escalation security issues, we
must carefully control the use of these file-handles.

The low level file-handle is stored in a public struct ~struct file_handle~ that
is part of the widely used ~struct files_struct~. Consumers used to simply
access the fd directly by dereferencing pointers to ~struct files_struct~.

In order to guard access to such file-handles we do two things:

- tag the pathref file-handles and

- control access to the file-handle by making the structure
  ~struct file_handle~ private, only allowing access with accessor functions
  that implement a security boundary.

In order to avoid bypassing restrictive permissions on intermediate directories
of a client path, the root user is only impersonated after changing directory
to the parent directory of the client requested pathname.
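
The ordering of the two steps can be sketched with plain POSIX calls. This is
not Samba's implementation: ~ex_open_from_parent()~ is a hypothetical helper,
the use of ~O_NOFOLLOW~ is an assumption of the sketch, and the actual switch
to root is left out, with a comment marking where it would go.

#+begin_src c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <libgen.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Open the last component of "path" after changing into its parent
 * directory. The chdir() runs with the caller's own credentials, so
 * restrictive permissions on intermediate directories still apply.
 * Returns an open fd or -1. The working directory is restored.
 */
int ex_open_from_parent(const char *path, int flags)
{
	/* dirname() and basename() may modify their argument. */
	char *dir_copy = strdup(path);
	char *base_copy = strdup(path);
	int saved_cwd = open(".", O_RDONLY | O_DIRECTORY);
	int fd = -1;

	if (dir_copy == NULL || base_copy == NULL || saved_cwd == -1) {
		goto out;
	}
	if (chdir(dirname(dir_copy)) != 0) {
		goto out;
	}
	/*
	 * A fallback for systems without O_PATH would become root only
	 * around this single call (omitted in this sketch).
	 */
	fd = open(basename(base_copy), flags | O_NOFOLLOW);

	if (fchdir(saved_cwd) != 0 && fd != -1) {
		/* Don't hand out a handle if the cwd can't be restored. */
		close(fd);
		fd = -1;
	}
out:
	if (saved_cwd != -1) {
		close(saved_cwd);
	}
	free(dir_copy);
	free(base_copy);
	return fd;
}
#+end_src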

Two functions can then be used to fetch the low-level system file-handle from a
~struct files_struct~:

- ~fsp_get_io_fd(fsp)~: enforces that fsp is NOT a pathref file-handle and

- ~fsp_get_pathref_fd(fsp)~: allows fsp to be either a pathref file-handle or a
  traditional POSIX file-handle opened with ~O_RDONLY~ or any other POSIX open

Note that the name ~fsp_get_pathref_fd()~ may sound confusing at first, given
that the fsp can be either a pathref fsp or a "normal/full" fsp. But as any
full file-handle can be used both for IO and as a path reference, the name
correctly reflects the intended usage of the caller.
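
The idea behind the two accessors can be sketched with toy types. The
~ex_~ prefixed structures and functions below are simplified stand-ins, not
Samba's definitions, and returning -1 for a refused access is an assumption
of this sketch about how the enforcement could look.

#+begin_src c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for struct file_handle and struct files_struct. */
struct ex_file_handle {
	int fd;
};

struct ex_files_struct {
	bool is_pathref;           /* the tag on pathref file-handles */
	struct ex_file_handle *fh; /* only touched by the accessors */
};

/* For IO: a pathref file-handle is refused. */
int ex_fsp_get_io_fd(const struct ex_files_struct *fsp)
{
	if (fsp->is_pathref) {
		return -1;
	}
	return fsp->fh->fd;
}

/* For use as a path reference: any file-handle qualifies. */
int ex_fsp_get_pathref_fd(const struct ex_files_struct *fsp)
{
	return fsp->fh->fd;
}
#+end_src

A caller that only needs the descriptor for something like ~fstat()~ asks for
the pathref fd and works with both kinds of handle. A caller that wants to do
IO asks for the IO fd and gets -1 for a pathref handle.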
*** When to use fsp_get_io_fd() or fsp_get_pathref_fd()

The general guideline is:

- if you do something like ~fstat(fd)~, use ~fsp_get_pathref_fd()~,

- if you do something like ~*at(dirfd, ...)~, use ~fsp_get_pathref_fd()~,

- if you want to print the fd for example in =DEBUG= messages, use ~fsp_get_pathref_fd()~,

- if you want to call ~close(fd)~, use ~fsp_get_pathref_fd()~,

- if you're doing a logical comparison of fd values, use ~fsp_get_pathref_fd()~.

In any other case use ~fsp_get_io_fd()~.

[fn:manpage] Parts of the following sections are copied from man open(2).
[fn:gitlab] https://gitlab.com/samba-team/devel/samba/-/commits/slow-pathref-wip

* VFS status quo and remaining work
** VFS Functions Tables [fn:VFS_API]
*** Existing VFS Functions
#+ATTR_HTML: :border 1 :rules all :frame border
| VFS Function                      | Group    | Status |
|-----------------------------------+----------+--------|
| SMB_VFS_AIO_FORCE()               | [[fsp][fsp]] | - |
| SMB_VFS_AUDIT_FILE()              | [[Special][Special]] | - |
| SMB_VFS_BRL_LOCK_WINDOWS()        | [[fsp][fsp]] | - |
| SMB_VFS_BRL_UNLOCK_WINDOWS()      | [[fsp][fsp]] | - |
| SMB_VFS_CHDIR()                   | [[Path][Path]] | Todo |
| SMB_VFS_CHFLAGS()                 | [[Path][Path]] | - |
| SMB_VFS_CHMOD()                   | [[Path][Path]] | - |
| SMB_VFS_CLOSE()                   | [[fsp][fsp]] | - |
| SMB_VFS_CLOSEDIR()                | [[fsp][fsp]] | - |
| SMB_VFS_CONNECT()                 | [[Disk][Disk]] | - |
| SMB_VFS_CONNECTPATH()             | [[P2px][P2px]] | - |
| SMB_VFS_CREATE_DFS_PATHAT()       | [[NsC][NsC]] | - |
| SMB_VFS_CREATE_FILE()             | [[NsC][NsC]] | - |
| SMB_VFS_DISCONNECT()              | [[Disk][Disk]] | - |
| SMB_VFS_DISK_FREE()               | [[Disk][Disk]] | - |
| SMB_VFS_DURABLE_COOKIE()          | [[fsp][fsp]] | - |
| SMB_VFS_DURABLE_DISCONNECT()      | [[fsp][fsp]] | - |
| SMB_VFS_DURABLE_RECONNECT()       | [[fsp][fsp]] | - |
| SMB_VFS_FALLOCATE()               | [[fsp][fsp]] | - |
| SMB_VFS_FCHMOD()                  | [[fsp][fsp]] | - |
| SMB_VFS_FCHOWN()                  | [[fsp][fsp]] | - |
| SMB_VFS_FCNTL()                   | [[fsp][fsp]] | - |
| SMB_VFS_FDOPENDIR()               | [[fsp][fsp]] | - |
| SMB_VFS_FGET_COMPRESSION()        | [[fsp][fsp]] | - |
| SMB_VFS_FGET_DOS_ATTRIBUTES()     | [[fsp][fsp]] | - |
| SMB_VFS_FGET_NT_ACL()             | [[fsp][fsp]] | - |
| SMB_VFS_FGETXATTR()               | [[xpathref][xpathref]] | - |
| SMB_VFS_FILE_ID_CREATE()          | [[Special][Special]] | - |
| SMB_VFS_FLISTXATTR()              | [[xpathref][xpathref]] | - |
| SMB_VFS_FREMOVEXATTR()            | [[xpathref][xpathref]] | - |
| SMB_VFS_FS_CAPABILITIES()         | [[Disk][Disk]] | - |
| SMB_VFS_FSCTL()                   | [[fsp][fsp]] | - |
| SMB_VFS_FSET_DOS_ATTRIBUTES()     | [[fsp][fsp]] | - |
| SMB_VFS_FSET_NT_ACL()             | [[fsp][fsp]] | - |
| SMB_VFS_FSETXATTR()               | [[xpathref][xpathref]] | - |
| SMB_VFS_FS_FILE_ID()              | [[Special][Special]] | - |
| SMB_VFS_FSTAT()                   | [[fsp][fsp]] | - |
| SMB_VFS_FSYNC()                   | [[fsp][fsp]] | - |
| SMB_VFS_FSYNC_SEND()              | [[fsp][fsp]] | - |
| SMB_VFS_FTRUNCATE()               | [[fsp][fsp]] | - |
| SMB_VFS_GET_ALLOC_SIZE()          | [[fsp][fsp]] | - |
| SMB_VFS_GET_DFS_REFERRALS()       | [[Disk][Disk]] | - |
| SMB_VFS_GET_DOS_ATTRIBUTES_RECV() | [[Enum][Enum]] | - |
| SMB_VFS_GET_DOS_ATTRIBUTES_SEND() | [[Enum][Enum]] | - |
| SMB_VFS_GETLOCK()                 | [[fsp][fsp]] | - |
| SMB_VFS_GET_NT_ACL_AT()           | [[Path][Path]] | - |
| SMB_VFS_GET_QUOTA()               | [[Special][Special]] | - |
| SMB_VFS_GET_REAL_FILENAME()       | [[P2px][P2px]] | - |
| SMB_VFS_GET_SHADOW_COPY_DATA()    | [[fsp][fsp]] | - |
| SMB_VFS_GETWD()                   | [[Special][Special]] | - |
| SMB_VFS_GETXATTR()                | [[Path][Path]] | - |
| SMB_VFS_GETXATTRAT_RECV()         | [[Enum][Enum]] | - |
| SMB_VFS_GETXATTRAT_SEND()         | [[Enum][Enum]] | - |
| SMB_VFS_FILESYSTEM_SHAREMODE()    | [[fsp][fsp]] | - |
| SMB_VFS_LCHOWN()                  | [[Path][Path]] | Todo |
| SMB_VFS_LINKAT()                  | [[NsC][NsC]] | - |
| SMB_VFS_LINUX_SETLEASE()          | [[fsp][fsp]] | - |
| SMB_VFS_LISTXATTR()               | [[Path][Path]] | - |
| SMB_VFS_LOCK()                    | [[fsp][fsp]] | - |
| SMB_VFS_LSEEK()                   | [[fsp][fsp]] | - |
| SMB_VFS_LSTAT()                   | [[Path][Path]] | Todo |
| SMB_VFS_MKDIRAT()                 | [[NsC][NsC]] | - |
| SMB_VFS_MKNODAT()                 | [[NsC][NsC]] | - |
| SMB_VFS_NTIMES()                  | [[Path][Path]] | - |
| SMB_VFS_OFFLOAD_READ_RECV()       | [[fsp][fsp]] | - |
| SMB_VFS_OFFLOAD_READ_SEND()       | [[fsp][fsp]] | - |
| SMB_VFS_OFFLOAD_WRITE_RECV()      | [[fsp][fsp]] | - |
| SMB_VFS_OFFLOAD_WRITE_SEND()      | [[fsp][fsp]] | - |
| SMB_VFS_OPENAT()                  | [[NsC][NsC]] | - |
| SMB_VFS_PREAD()                   | [[fsp][fsp]] | - |
| SMB_VFS_PREAD_SEND()              | [[fsp][fsp]] | - |
| SMB_VFS_PWRITE()                  | [[fsp][fsp]] | - |
| SMB_VFS_PWRITE_SEND()             | [[fsp][fsp]] | - |
| SMB_VFS_READ_DFS_PATHAT()         | [[Symlink][Symlink]] | - |
| SMB_VFS_READDIR()                 | [[fsp][fsp]] | - |
| SMB_VFS_READDIR_ATTR()            | [[Path][Path]] | - |
| SMB_VFS_READLINKAT()              | [[Symlink][Symlink]] | - |
| SMB_VFS_REALPATH()                | [[P2px][P2px]] | - |
| SMB_VFS_RECVFILE()                | [[fsp][fsp]] | - |
| SMB_VFS_REMOVEXATTR()             | [[Path][Path]] | - |
| SMB_VFS_RENAMEAT()                | [[Path][Path]] | ---- |
| SMB_VFS_REWINDDIR()               | [[fsp][fsp]] | - |
| SMB_VFS_SENDFILE()                | [[fsp][fsp]] | - |
| SMB_VFS_SET_COMPRESSION()         | [[fsp][fsp]] | - |
| SMB_VFS_SET_DOS_ATTRIBUTES()      | [[Path][Path]] | - |
| SMB_VFS_SET_QUOTA()               | [[Special][Special]] | - |
| SMB_VFS_SETXATTR()                | [[Path][Path]] | - |
| SMB_VFS_SNAP_CHECK_PATH()         | [[Disk][Disk]] | - |
| SMB_VFS_SNAP_CREATE()             | [[Disk][Disk]] | - |
| SMB_VFS_SNAP_DELETE()             | [[Disk][Disk]] | - |
| SMB_VFS_STAT()                    | [[Path][Path]] | Todo |
| SMB_VFS_STATVFS()                 | [[Disk][Disk]] | - |
| SMB_VFS_STREAMINFO()              | [[Path][Path]] | - |
| SMB_VFS_STRICT_LOCK_CHECK()       | [[fsp][fsp]] | - |
| SMB_VFS_SYMLINKAT()               | [[NsC][NsC]] | - |
| SMB_VFS_SYS_ACL_BLOB_GET_FD()     | [[xpathref][xpathref]] | - |
| SMB_VFS_SYS_ACL_BLOB_GET_FILE()   | [[Path][Path]] | - |
| SMB_VFS_SYS_ACL_DELETE_DEF_FILE() | [[Path][Path]] | - |
| SMB_VFS_SYS_ACL_GET_FD()          | [[xpathref][xpathref]] | - |
| SMB_VFS_SYS_ACL_GET_FILE()        | [[Path][Path]] | - |
| SMB_VFS_SYS_ACL_SET_FD()          | [[xpathref][xpathref]] | - |
| SMB_VFS_TRANSLATE_NAME()          | [[P2px][P2px]] | - |
| SMB_VFS_UNLINKAT()                | [[NsC][NsC]] | - |
|-----------------------------------+----------+--------|

*** New VFS Functions
#+ATTR_HTML: :border 1 :rules all :frame border
| VFS Function                    | Group    | Status |
|---------------------------------+----------+--------|
| SMB_VFS_SYS_ACL_DELETE_DEF_FD() | [[xpathref][xpathref]] | - |
| SMB_VFS_FNTIMENS()              | [[fsp][fsp]] | - |
|---------------------------------+----------+--------|

** VFS functions by category
*** Disk operations <<Disk>>
- SMB_VFS_DISCONNECT()
- SMB_VFS_DISK_FREE()
- SMB_VFS_FS_CAPABILITIES()
- SMB_VFS_GET_DFS_REFERRALS()
- SMB_VFS_SNAP_CHECK_PATH()
- SMB_VFS_SNAP_CREATE()
- SMB_VFS_SNAP_DELETE()
*** Handle based VFS functions <<fsp>>
- SMB_VFS_AIO_FORCE()
- SMB_VFS_BRL_LOCK_WINDOWS()
- SMB_VFS_BRL_UNLOCK_WINDOWS()
- SMB_VFS_DURABLE_COOKIE()
- SMB_VFS_DURABLE_DISCONNECT()
- SMB_VFS_FALLOCATE()
- SMB_VFS_FDOPENDIR()
- SMB_VFS_FGET_DOS_ATTRIBUTES()
- SMB_VFS_FGET_NT_ACL()
- SMB_VFS_FSET_DOS_ATTRIBUTES()
- SMB_VFS_FSET_NT_ACL()
- SMB_VFS_FSYNC_SEND()
- SMB_VFS_FTRUNCATE()
- SMB_VFS_GET_ALLOC_SIZE()
- SMB_VFS_GET_SHADOW_COPY_DATA()
- SMB_VFS_FILESYSTEM_SHAREMODE()
- SMB_VFS_LINUX_SETLEASE()
- SMB_VFS_OFFLOAD_READ_SEND()
- SMB_VFS_OFFLOAD_WRITE_SEND()
- SMB_VFS_PREAD_SEND()
- SMB_VFS_PWRITE_SEND()
- SMB_VFS_REWINDDIR()
- SMB_VFS_SET_COMPRESSION()
- SMB_VFS_STRICT_LOCK_CHECK()

If an fsp is provided by the SMB layer we use that, otherwise we use the
pathref fsp =smb_fname->fsp= provided by =filename_convert()=.
*** Namespace changing VFS functions <<NsC>>

- SMB_VFS_CREATE_FILE()

All intermediate VFS calls within =SMB_VFS_CREATE_FILE()= will be based on
=smb_fname->fsp= if the requested path exists. When creating a file we rely on
=non_widelink_open()= which doesn't depend on a dirfsp.

Needs a real dirfsp (done).

Is only called from within =non_widelink_open()= with a dirfsp equivalent of
=AT_FDCWD= and so doesn't need a real dirfsp.

The following operations need a real dirfsp:

- SMB_VFS_SYMLINKAT()

Callers use =openat_pathref_fsp()= to open an fsp on the parent directory.
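
At the system call level, a handle on the parent directory is what the
~*at()~ family expects as its dirfd argument. The following minimal sketch
uses plain Linux calls rather than the Samba VFS. The function names are
hypothetical. It shows that an ~O_PATH~ descriptor on the parent directory is
sufficient for namespace changing operations.

#+begin_src c
#define _GNU_SOURCE /* needed for O_PATH */
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create and remove the directory "name" below the directory that
 * dirfd refers to. The *at() calls only use dirfd as a location in the
 * filesystem tree, so an O_PATH descriptor is enough.
 */
int ex_mkdir_rmdir_at(int dirfd, const char *name)
{
	struct stat st;

	if (mkdirat(dirfd, name, 0700) != 0) {
		return -1;
	}
	if (fstatat(dirfd, name, &st, AT_SYMLINK_NOFOLLOW) != 0 ||
	    !S_ISDIR(st.st_mode)) {
		unlinkat(dirfd, name, AT_REMOVEDIR);
		return -1;
	}
	return unlinkat(dirfd, name, AT_REMOVEDIR);
}

/* Returns 0 if the operations work through an O_PATH parent handle. */
int ex_parent_dirfd_example(void)
{
	char parent[] = "/tmp/vfs_dirfd_XXXXXX";
	int ret = -1;
	int dirfd;

	if (mkdtemp(parent) == NULL) {
		return -1;
	}
	dirfd = open(parent, O_PATH | O_DIRECTORY);
	if (dirfd != -1) {
		ret = ex_mkdir_rmdir_at(dirfd, "subdir");
		close(dirfd);
	}
	rmdir(parent);
	return ret;
}
#+end_src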

*** Path based VFS functions <<Path>>
All path based VFS functions will be replaced by handle based variants using the
=smb_fname->fsp= provided by =filename_convert()=.

- SMB_VFS_DURABLE_RECONNECT()
- SMB_VFS_GET_COMPRESSION()
- SMB_VFS_GET_DOS_ATTRIBUTES()
- SMB_VFS_GET_NT_ACL_AT()
- SMB_VFS_LISTXATTR()
- SMB_VFS_REMOVEXATTR()
- SMB_VFS_SET_DOS_ATTRIBUTES()
- SMB_VFS_STREAMINFO()
- SMB_VFS_SYS_ACL_BLOB_GET_FILE()
- SMB_VFS_SYS_ACL_DELETE_DEF_FILE()
- SMB_VFS_SYS_ACL_GET_FILE()
- SMB_VFS_SYS_ACL_SET_FILE()

Replace with corresponding handle based VFS calls.
*** AT VFS functions that can't be based on handles <<Symlink>>
- SMB_VFS_CREATE_DFS_PATHAT()
- SMB_VFS_READ_DFS_PATHAT()
- SMB_VFS_READLINKAT()

As the DFS link implementation is based on symlinks, we have to use *AT based
functions with real dirfsps.

*** AT VFS functions needed for directory enumeration <<Enum>>
- SMB_VFS_GET_DOS_ATTRIBUTES_SEND()
- SMB_VFS_GETXATTRAT_SEND()
*** Handle based VFS functions not allowed on O_PATH opened handles <<xpathref>>
- SMB_VFS_FGETXATTR()
- SMB_VFS_FLISTXATTR()
- SMB_VFS_FREMOVEXATTR()
- SMB_VFS_FSETXATTR()
- SMB_VFS_SYS_ACL_BLOB_GET_FD()
- SMB_VFS_SYS_ACL_GET_FD()
- SMB_VFS_SYS_ACL_DELETE_DEF_FD() (NEW)
- SMB_VFS_SYS_ACL_SET_FD()

These functions are based upon securely opening a full fd via
=/proc/self/fd/%d=: in the case of xattrs, pathref handles can't be used for
xattr IO, and in the case of ACLs, pathref handles can't be used to access
default ACEs.
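
The procfs technique can be sketched in a few lines on Linux. This is only an
illustration of reopening through =/proc/self/fd/%d=, assuming procfs is
mounted. The function names are hypothetical, and the sketch makes no claim
about the additional checks Samba performs to make this secure.

#+begin_src c
#define _GNU_SOURCE /* needed for O_PATH */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Turn an O_PATH descriptor into a full descriptor by reopening it via
 * procfs. The kernel does the usual permission checks for "flags" at
 * this point. Returns the new fd or -1.
 */
int ex_reopen_pathref(int pathref_fd, int flags)
{
	char proc_path[64];
	int len = snprintf(proc_path, sizeof(proc_path),
			   "/proc/self/fd/%d", pathref_fd);

	if (len < 0 || (size_t)len >= sizeof(proc_path)) {
		return -1;
	}
	return open(proc_path, flags);
}

/* Returns 0 if data can be read through the reopened descriptor. */
int ex_reopen_example(void)
{
	char name[] = "/tmp/vfs_reopen_XXXXXX";
	char buf[4] = { 0 };
	int ret = -1;
	int pathref_fd = -1;
	int io_fd = -1;
	int wrote;
	int fd = mkstemp(name);

	if (fd == -1) {
		return -1;
	}
	wrote = (write(fd, "data", 4) == 4);
	close(fd);
	if (!wrote) {
		goto out;
	}
	pathref_fd = open(name, O_PATH);
	if (pathref_fd == -1) {
		goto out;
	}
	io_fd = ex_reopen_pathref(pathref_fd, O_RDONLY);
	if (io_fd == -1) {
		goto out;
	}
	if (read(io_fd, buf, sizeof(buf)) == 4 && buf[0] == 'd') {
		ret = 0;
	}
out:
	if (io_fd != -1) {
		close(io_fd);
	}
	if (pathref_fd != -1) {
		close(pathref_fd);
	}
	unlink(name);
	return ret;
}
#+end_src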
*** Pure path to path translation <<P2px>>
- SMB_VFS_CONNECTPATH()
- SMB_VFS_GET_REAL_FILENAME()
- SMB_VFS_TRANSLATE_NAME()
*** Special cases <<Special>>
- SMB_VFS_FILE_ID_CREATE()
- SMB_VFS_FS_FILE_ID()
- SMB_VFS_GET_QUOTA()
- SMB_VFS_SET_QUOTA()
- SMB_VFS_AUDIT_FILE()

This is currently unused.

[fn:VFS_API] ~grep 'SMB_VFS_*' source3/include/vfs_macros.h | grep -v NEXT_ | sed 's|.*\(SMB_VFS_.*\)(.*|\1()|' | sort~