Maniphest T101382

Asset Browser aggressively requests stat calls
Closed, Archived

Assigned To
None
Authored By
Pavel Duong (vignette)
Sep 26 2022, 3:44 PM
Tags
  • BF Blender
Subscribers
Harley Acheson (harley)
Pavel Duong (vignette)
Pratik Borhade (PratikPB2123)

Description

System Information
Operating system: Arch Linux, Windows 10
Graphics card: Intel UHD 620, nVidia GTX 1060 Super respectively

Blender Version
Broken: 19ae71c11342 (3.3.1 RC)
Worked:

Short description of error

Note: Asset Indexing is enabled in the Experimental tab.

The asset browser (monitored with strace on Linux) calls stat() very frequently, even when the mouse is just hovering over the thumbnails of already loaded assets. This doesn't seem to be an issue on my Linux setup, where I assume the calls are cached by the kernel, but it is very stuttery on Windows when the asset library is on a network Samba share.

To demonstrate, here are videos from Linux and Windows. On Linux, the asset library is locally stored, whereas on Windows, the asset library is on a network share, provided by the Linux system.

As you can see from the terminal output, I've used two commands: one to output traced read and stat calls and one to count them as they come. Just by hovering over the asset browser we can see the high number of calls; however, given that the files are stored locally (and cached by the kernel, I presume), performance remains high and I notice no stuttering.

On Windows, just plain opening the asset browser takes at least a minute to load the index/thumbnails (not shown on video). However, after we get to the same state as in Linux, we observe the stuttering when hovering. Unfortunately, I don't have enough knowledge on Windows' workings, so I can't provide any system trace calls at the moment.
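Where no system trace is available, a rough cross-platform measurement is still possible. The following is a minimal Python sketch, not taken from Blender's code: it walks a directory tree and times one `os.stat()` call per file. The function name `time_stat_calls` is hypothetical, and OS-level caching means a second run may report much lower numbers than the first.

```python
import os
import time


def time_stat_calls(root):
    """Walk `root` and time one os.stat() call per file.

    Returns (file_count, total_seconds, slowest_seconds).
    """
    count = 0
    total = 0.0
    slowest = 0.0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            start = time.perf_counter()
            try:
                os.stat(path)
            except OSError:
                # File vanished or is unreadable; leave it out of the totals.
                continue
            elapsed = time.perf_counter() - start
            count += 1
            total += elapsed
            slowest = max(slowest, elapsed)
    return count, total, slowest
```

Running this once against a local copy of the library and once against the network share would show whether per-call stat latency differs enough between the two to matter.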

However, to demonstrate the issue, a simple playback shows the FPS in the viewport's top-left corner, and we can see a significant dip once we begin to scroll/hover over the asset browser (around 20s in the provided video). The highlighting of a hovered thumbnail also lags far behind the cursor.

There's also the issue of the slow loading of the index/thumbnail on a cold launch, even if the thumbnails/index were generated previously, but I suppose that might be a topic for a different issue.

Exact steps for others to reproduce the error

  1. Have an asset library on a network share that contains a relatively high number of files (in my case, the PolyHaven library)
  2. Add the asset library and open up the asset browser
  3. (Possibly wait for indexing to be done/thumbnail generation, can take a long time)
  4. Notice that even when just hovering over the thumbnails, the Blender interface lags (e.g. the thumbnail highlighting is way behind the cursor)

Event Timeline

Pavel Duong (vignette) created this task. Sep 26 2022, 3:44 PM
Pavel Duong (vignette) updated the task description. Sep 26 2022, 6:10 PM
Pavel Duong (vignette) updated the task description.
Harley Acheson (harley) added a subscriber: Harley Acheson (harley). Sep 27 2022, 12:24 AM

Honestly not sure what the provided information can achieve. It might be a very good starting point for investigation if you were considering attempting to optimize this performance yourself. But on its own it doesn't say anything conclusive.

There is an assertion here that the "stuttering" performance experienced is related to the number of stat calls shown, but evidence that these things are connected would be found in actually profiling the code. Otherwise it seems to be just a guess from the only type of data collected. By what metric do you judge these stat calls to be too many to the point of "aggressive"? How many is the correct number?

There could be more calls to stat than is required. We might be calling it to see if a file (and the thumbnail for it) exists and then also calling it again to find the file dates or file size. But... stat calls are extremely fast and would not account for the long delays shown in your second video. To understand what is happening there requires that you load up the source and profile to see what is taking up the time. There will be something, and likely something that can be improved. But it won't be the stats.
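To illustrate the redundancy described above, here is a minimal Python sketch (not Blender's actual code, which is written in C/C++) in which one `os.stat()` call answers both the existence question and the size/date question. The helper name `file_info` is hypothetical.

```python
import os


def file_info(path):
    """Return (size_bytes, mtime) for `path`, or None if it does not exist.

    A single os.stat() call covers both "does it exist?" and "what are its
    size and modification time?", so no separate existence check is needed.
    """
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return st.st_size, st.st_mtime
```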

Pavel Duong (vignette) added a comment. Sep 27 2022, 1:10 AM
In T101382#1423146, @Harley Acheson (harley) wrote:

Honestly not sure what the provided information can achieve. It might be a very good starting point for investigation if you were considering attempting to optimize this performance yourself. But on its own it doesn't say anything conclusive.

Just thought to attempt to raise a potential issue more than anything, sorry if this isn't the right channel for it.

There is an assertion here that the "stuttering" performance experienced is related to the number of stat calls shown, but evidence that these things are connected would be found in actually profiling the code. Otherwise it seems to be just a guess from the only type of data collected.

I'll have to admit, those are indeed guesses based on the fact that one setup stores the library on a physical (thus faster) device and the other on a network (way slower) share.

I've tried to at least remove the OS factor and tested the same setup on two Linux devices, one connected to the other in the same way Windows was connected to the Linux machine's SMB share, with the additional difference of being linked over Wi-Fi rather than by cable.

The same effect occurred, and the strace command showed that each stat can take up to 0.5 seconds, which quickly stacks up.

By what metric do you judge these stat calls to be too many to the point of "aggressive"? How many is the correct number?

As in the second part of the first video, where the calls are counted, they have quickly reached thousands from just hovering over with a mouse. I'd assume at the very least that such information shouldn't really be needed to be accessed from the filesystem again so often and so it'd be cached in memory or at least somewhere local.

As to how many is the correct number, I'd say that tens of thousands while just hovering the mouse is probably not the correct one.
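The kind of in-memory caching suggested above could look roughly like the following Python sketch. It is an illustration under assumptions, not how Blender handles file metadata: the class name `StatCache`, the two-second default lifetime, and the `misses` counter are all hypothetical.

```python
import os
import time


class StatCache:
    """In-memory cache of os.stat() results with a time-to-live (hypothetical helper)."""

    def __init__(self, ttl_seconds=2.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # path -> (expiry_time, stat_result or None)
        self.misses = 0     # number of real os.stat() calls made

    def stat(self, path):
        now = self.clock()
        entry = self._entries.get(path)
        if entry is not None and entry[0] > now:
            # Entry is still fresh: answer from memory, no filesystem access.
            return entry[1]
        self.misses += 1
        try:
            result = os.stat(path)
        except OSError:
            result = None  # remember failures too, so they are not retried every call
        self._entries[path] = (now + self.ttl, result)
        return result
```

With such a cache, repeated lookups of the same path within the lifetime cost one real stat call instead of one per lookup. The trade-off is that changes on disk can go unnoticed for up to `ttl_seconds`.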

There could be more calls to stat than is required. We might be calling it to see if a file (and the thumbnail for it) exists and then also calling it again to find the file dates or file size.

I understand that, but is it necessary to call it on every frame the mouse hovers over the thumbnail/asset library?

But... stat calls are extremely fast and would not account for the long delays shown in your second video.

They might be fast on local devices, but use cases such as the asset library being stored on a network share (or even slow storage media) might pose some trouble, as the delays stack up with the number of files.

To understand what is happening there requires that you load up the source and profile to see what is taking up the time. There will be something, and likely something that can be improved. But it won't be the stats.

Indeed, as I've said, it was just a guess and a shot in the dark. It might not be directly the stat calls, but it seems to be linked with the asset library being on a slower storage device.

I understand this kind of "unscientific" investigation isn't exactly the best; however, I lack the resources to provide a detailed analysis, so I just left it at attempting to raise a potential issue.

Harley Acheson (harley) added a comment. Sep 27 2022, 1:46 AM

@Pavel Duong (vignette) - Just thought to attempt to raise a potential issue more than anything, sorry if this isn't the right channel for it

Yes, no worries. Although this doesn't "fit" as a bug report - and will probably just get closed - I will probably poke around in that code there when I get a chance.

I'll have to admit, those are indeed guesses based on the fact that one setup stores the library on a physical (thus faster) device and the other on a network (way slower) share.

There are probably some other more intensive file accesses in there, like reads.

with the additional difference of being linked over Wi-Fi rather than by cable.

Yes, there will be lots of performance bottlenecks when connecting to a file share over a slow connection like Wi-Fi. Especially when creating thumbnails of files, since that requires reading the entirety of each file to make each preview.

...but is it necessary to call it on every frame the mouse hovers over the thumbnail/asset library?

To answer that you would have to know about the internals of that code.

...but use cases such as the asset library being stored on a network share (or even slow storage media) might pose some trouble, as the delays stack up with the number of files.

For sure, but in this case it will be other file activity, like accesses and reads that will contribute to most of that time, not the stats.

...however I lack the resources to provide detailed analysis, so I just left it at attempting to raise a potential issue.

No worries. I really do appreciate the attempt to help. But this requires that someone else do the actual profiling to find the root of the issue and then improve it. This just doesn't fit within this system of bug reporting.

Pratik Borhade (PratikPB2123) closed this task as Archived. Sep 27 2022, 7:11 AM
Pratik Borhade (PratikPB2123) added a subscriber: Pratik Borhade (PratikPB2123).

Hi, thanks for the report. As far as I understand from the above discussion, stat() calls are not entirely responsible for the slowdown, right?
Also to note:

  • We need a reliable way to reproduce the problem in order to fix it
  • Reports on performance improvements are not considered as bug reports

Closing this ticket for now. Don't hesitate to comment or reopen the report if there is a misunderstanding.

While we do continue to work on improving performance in general, potential performance improvements are not handled as bug reports.
To improve performance, consider using less complex geometry, simpler shaders and smaller textures.