Following <kcref>last week's discussions
</kcref>, David Howells, Alexandre Julliard and Gavriel State resumed
their exchanges.
David reiterated his main gripe, the slow speed
of access to files: every Read/WriteFile goes to the wineserver to
convert the handle into a file descriptor and to check for
locking. The FD is then passed back over a UNIX domain socket, used
once and then closed.
Alexandre Julliard explained this had just been enhanced: the file
descriptor is only transferred once. All subsequent accesses only
check if the file descriptor on the client's side is still valid, hence
reducing the complexity and the length of the server call (but not
the number of calls).
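The descriptor hand-off described above relies on a standard UNIX mechanism: a file descriptor can travel over a UNIX domain socket as SCM_RIGHTS ancillary data. The following is only a minimal sketch of that mechanism, not the wineserver's actual code; the helper names send_fd and recv_fd are made up for illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send one descriptor over a connected UNIX
 * domain socket. Returns 0 on success, -1 on error. */
int send_fd(int sock, int fd)
{
    char dummy = 'F';   /* one byte of ordinary data carries the control message */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {             /* union keeps the control buffer suitably aligned */
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctrl, 0, sizeof(ctrl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;          /* "this message carries descriptors" */
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor sent by send_fd().
 * Returns the new descriptor, or -1 on error. */
int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union {
        char buf[CMSG_SPACE(sizeof(int))];
        struct cmsghdr align;
    } ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The receiver ends up with its own descriptor referring to the same open file, which is why a client that keeps it around only needs the server to confirm it is still valid.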
The latency of a Wine server call is rather high, as David explained:
Context switching is the main element of
it. Going to the wineserver and back again just for a ReadFile() call
or a Wait*() function incurs a fairly serious penalty (particularly on
an X86, I think). Plus there's no requirement for the kernel to pass
the remains of your timeslice to the wineserver and back again.
Since the context switch also implies that you have to
flush all the CPU caches, muck around with the MMU and execute
scheduling algorithms, this can explain some of the latency.
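To get a feel for the cost being discussed, one can time a trivial request/reply exchange between two processes over a UNIX domain socket. The sketch below is an assumption-laden micro-benchmark, not something taken from the thread: the function name measure_round_trip_ns is hypothetical, the "server" is just a forked child echoing one byte, and on a multi-processor machine the figure reflects IPC round-trip time rather than pure context-switch cost.

```c
#define _GNU_SOURCE   /* expose clock_gettime/fork under strict -std modes */
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: average time, in nanoseconds, of one
 * request/reply round trip to a child process. Returns -1.0 on error. */
double measure_round_trip_ns(int rounds)
{
    int sv[2], i, failed = 0;
    char byte = 0;
    struct timespec start, end;
    pid_t pid;

    if (rounds <= 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1.0;

    pid = fork();
    if (pid < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1.0;
    }
    if (pid == 0) {
        /* Child plays the server: echo every byte until the client hangs up. */
        close(sv[0]);
        while (read(sv[1], &byte, 1) == 1 && write(sv[1], &byte, 1) == 1)
            ;
        _exit(0);
    }

    /* Parent plays the client: each iteration blocks until the reply arrives. */
    close(sv[1]);
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (i = 0; i < rounds && !failed; i++)
        if (write(sv[0], &byte, 1) != 1 || read(sv[0], &byte, 1) != 1)
            failed = 1;
    clock_gettime(CLOCK_MONOTONIC, &end);

    close(sv[0]);               /* child sees EOF and exits */
    waitpid(pid, NULL, 0);
    if (failed)
        return -1.0;
    return ((end.tv_sec - start.tv_sec) * 1e9
            + (end.tv_nsec - start.tv_nsec)) / rounds;
}
```

Calling measure_round_trip_ns(100000) and comparing the result with the cost of a plain read() on a cached file gives a rough idea of the overhead a server round trip adds to each I/O call.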
However, Alexandre thinks that "it should be possible
to improve that by a small kernel hack. It will never be as fast as
doing everything in the kernel of course, but it may just be fast
enough to avoid the need to reimplement the whole server." He also
thinks that "we are doing more than two switches (though I
haven't proved it), which is why I think there is a margin for
improvement. You'll obviously always have the context switch cost
unless everything is in the kernel."
By "a small kernel hack", Alexandre means
"having a specialized fifo, a network protocol, an ioctl,
etc. Basically any mechanism that ensures that we do the strict
minimum number of context switches and schedule() calls for a server
call. And probably also a way to transfer chunks of memory from the
client address space so that we don't need the shared memory area."
David had already suggested a new protocol (AF_WINE), which could
fit nicely into this category (and would also allow the
internal API to be used on non-Linux platforms, although the kernel
module would have to be rewritten for each of them).
David also asked Alexandre how he plans to handle
the locking for Read/WriteFile: "Cache it locally? It is
unfortunate, but you can't really make use of UNIX file locking, since
this is mostly advisory and as such doesn't actively stop read/write
calls." Alexandre quickly replied: "Yes, we'll
need to store the locks in the server and check them before each
read/write (and probably also release them afterwards if
necessary). There may be some optimizations possible, but we should
probably do it the easy way first." This would, of course,
require some more server calls.
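The "easy way" of keeping the locks in the server and checking them before each read/write can be pictured as a small table of byte ranges. This is a simplified sketch under stated assumptions, not Wine code: every lock is treated as exclusive, owners are plain integers, the table has a fixed size, and all names (byte_lock, lock_table, lock_add, io_allowed) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOCKS 16    /* arbitrary limit for the sketch */

struct byte_lock {
    int owner;              /* hypothetical id of the locking process */
    unsigned long start;    /* first locked byte */
    unsigned long len;      /* number of locked bytes */
};

struct lock_table {
    struct byte_lock locks[MAX_LOCKS];
    size_t count;
};

/* Two half-open ranges [start, start+len) overlap if each one begins
 * before the other ends. Arithmetic overflow is ignored in this sketch. */
static int ranges_overlap(unsigned long a_start, unsigned long a_len,
                          unsigned long b_start, unsigned long b_len)
{
    return a_len && b_len
        && a_start < b_start + b_len
        && b_start < a_start + a_len;
}

/* The check the server would run before each read or write:
 * returns 1 if `owner` may touch the range, 0 if another owner's lock blocks it. */
int io_allowed(const struct lock_table *t, int owner,
               unsigned long start, unsigned long len)
{
    size_t i;

    for (i = 0; i < t->count; i++)
        if (t->locks[i].owner != owner
            && ranges_overlap(t->locks[i].start, t->locks[i].len, start, len))
            return 0;
    return 1;
}

/* Record a new lock. Returns 0 on success, -1 if the table is full,
 * the range is empty, or it conflicts with another owner's lock. */
int lock_add(struct lock_table *t, int owner,
             unsigned long start, unsigned long len)
{
    if (t->count == MAX_LOCKS || len == 0 || !io_allowed(t, owner, start, len))
        return -1;
    t->locks[t->count].owner = owner;
    t->locks[t->count].start = start;
    t->locks[t->count].len = len;
    t->count++;
    return 0;
}
```

Because the check has to be made by whoever holds the table, each read or write implies a round trip to the server, which is where the extra server calls come from.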
Later on, Gavriel explained that Alexandre would be unlikely to accept a
huge patch all at once, and that he'd rather have an incremental approach.
Alexandre answered, and also gave some directions for integrating the
kernel module David is working on into Wine:
The kernel module itself may be hard to do incrementally, but you
should really consider reusing the existing server API so that your
module can be plugged in easily. For instance your module entry points
should be the same as the server requests, and use the same request
structures.
As a reminder, David used the int 0x2E trap (as any NT system does) to
hook the kernel module up to the Wine code, putting more into the
Linux kernel than Wine currently does with its wineserver. However,
this introduces another API into Wine, and makes it quite difficult to
maintain the two APIs (the INT 0x2E and the wineserver's).
Alexandre explained what he had in mind a bit more clearly:
I'm not suggesting keeping the current socket stuff,
just reusing the structures. So basically instead of passing the
address of the stack arguments (which is really ugly IMO) to your
ioctl, you pass one of the server request structures. This allows your
changes to be localized to wine_server_call and doesn't require
changing any of the routines that make server calls. Obviously you'd
need some more changes for a few calls like ReadFile/WriteFile, but
most operations could switch to your mechanism without needing any
change. You simply cannot require people to recompile all of Wine to
use your module.
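The idea of keeping the request structures and swapping only the transport can be sketched as a single call function that dispatches through a function pointer. Everything below is hypothetical and only illustrates the shape of the design: the sketch_ names, the request layout and the two stand-in transports are invented here and do not correspond to Wine's real structures or to David's module.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical request types and structure, standing in for the
 * server request structures mentioned above. */
enum sketch_req_type { SKETCH_REQ_GET_HANDLE_FD, SKETCH_REQ_CLOSE_HANDLE };

struct sketch_request {
    enum sketch_req_type type;
    int handle;     /* input */
    int result;     /* output, filled in by the transport */
};

/* A transport takes a filled-in request and completes it. */
typedef int (*sketch_transport)(struct sketch_request *req);

static sketch_transport current_transport;

void sketch_set_transport(sketch_transport t)
{
    current_transport = t;
}

/* The single choke point: the only place that knows how requests travel. */
int sketch_server_call(struct sketch_request *req)
{
    if (!current_transport)
        return -1;
    return current_transport(req);
}

/* Stand-in for the socket-based path; the result value is arbitrary. */
int sketch_socket_transport(struct sketch_request *req)
{
    req->result = req->handle + 1000;
    return 0;
}

/* Stand-in for a kernel-module path (e.g. an ioctl taking the same struct). */
int sketch_module_transport(struct sketch_request *req)
{
    req->result = req->handle + 2000;
    return 0;
}

/* A caller: its code is identical whichever transport is installed. */
int sketch_get_handle_fd(int handle)
{
    struct sketch_request req = { SKETCH_REQ_GET_HANDLE_FD, handle, -1 };

    if (sketch_server_call(&req) < 0)
        return -1;
    return req.result;
}
```

With this shape, switching from one mechanism to the other is a matter of which transport is installed at startup, and routines such as sketch_get_handle_fd do not need to be touched or rebuilt differently.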
David also pointed out some strange issues with the Wine loader. After
some discussion, it turned out that the alignment required by mmap
changed between Linux 2.2 and 2.4. Wine had made the assumption that
page alignment is needed for the address in memory,
not for the offset inside the file on disk; since section virtual
addresses in PE files are always page-aligned, the memory address is
never a problem. The only problem comes from the alignment of the data
inside the PE file, and this is where we only need block-size
alignment to make mmap possible. David also proposed some
enhancements for the Linux 2.4 kernel.
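The alignment question can be made concrete with a loader-style helper that maps a section directly when its file offset is page-aligned and otherwise falls back to copying the data. This is a sketch of one possible way to cope with a kernel that insists on page-aligned file offsets, not a description of what Wine does; the names offset_is_page_aligned and load_section are hypothetical.

```c
#define _GNU_SOURCE   /* expose pread and MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if `offset` is a multiple of the system page size. */
int offset_is_page_aligned(off_t offset)
{
    long page = sysconf(_SC_PAGESIZE);

    return page > 0 && offset % page == 0;
}

/* Hypothetical helper: make `size` bytes starting at file offset `offset`
 * available in memory. Maps the file directly when the offset allows it,
 * otherwise reads the data into an anonymous mapping.
 * Returns NULL on failure; release the result with munmap(ptr, size). */
void *load_section(int fd, off_t offset, size_t size)
{
    void *mem;
    size_t done = 0;

    if (offset_is_page_aligned(offset)) {
        mem = mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, offset);
        if (mem != MAP_FAILED)
            return mem;
    }

    /* Fallback: the offset is only block-aligned (or the mapping failed),
     * so copy the bytes instead of mapping them. */
    mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;
    while (done < size) {
        ssize_t n = pread(fd, (char *)mem + done, size - done,
                          offset + (off_t)done);
        if (n <= 0)
            break;      /* error or short file: the rest stays zero-filled */
        done += (size_t)n;
    }
    return mem;
}
```

The fallback costs a copy and loses page sharing between processes, which is why the direct mapping is preferable whenever the file layout allows it.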
As a (temporary) conclusion, the optimization of the Wine
architecture is still under heavy discussion. Several avenues are
open, and the potential benefits are still not
clear. On the bright side, there's still plenty of room for
improvement.