Maniphest T60462

Blender 2.8 (various versions) crashes in GPU animation rendering (many card types), both eevee and cycles
Closed, Archived

Assigned To
Adam Preisler (Alphisto)
Authored By
Marin Myftiu (mm25)
Jan 13 2019, 10:59 AM
Tags
  • BF Blender
  • Cycles
  • EEVEE & Viewport
Subscribers
Adam Preisler (Alphisto)
akku (akku)
André Ferreira (thebadking)
Brecht Van Lommel (brecht)
brian (brizo)
Francesc Farfán (franfar)
jake downs (jakedowns)

Description

I have been having this problem for quite some time now, with various Blender 2.8 beta versions, on both my office and home desktops: I start rendering an animation, it renders for a random length of time (a few frames or a few hundred frames), and then Blender closes as if nothing happened.
Cards and CPUs are all stock. I tested RAM stability and it is OK.
Memory capacity is also not exceeded.
This happens only on 2.8, in both Cycles and Eevee. 2.79b renders happily, and I still model in it, but it is about 50% slower and I can't live with those render times anymore.

System Information
Operating system: Win. 10 (pro and home)
Graphics card: Vega 56 reference + RX 480 8GB on one PC and a Vega 56 on the other
(crash happens with both or one card rendering on the dual GPU system)
Both systems have ample amounts of RAM for what I do (16 and 24 GB)
CPUs are a core i7 and AMD 870K

Blender Version
Blender 2.8. This has been happening for a few weeks now; I try the beta builds almost daily.
With previous Blender/AMD driver versions there was also an occasional driver crash or system freeze. Now Blender just shuts off.

Exact steps for others to reproduce the error
It does not look specific to a file; happens with all my projects, both saved in 2.8 and 2.79 and opened in 2.8. You hit render and wait for it to happen.

Related Objects

Mentioned Here
rB7d792976e100: doxygen: update doxygen & add balembic group

Event Timeline

Marin Myftiu (mm25) created this task.Jan 13 2019, 10:59 AM
Marin Myftiu (mm25) updated the task description.Jan 13 2019, 11:04 AM
Marin Myftiu (mm25) added projects: BF Blender: 2.8, Eevee, Cycles.
Brecht Van Lommel (brecht) lowered the priority of this task from 90 to 30.Jan 13 2019, 1:10 PM
Brecht Van Lommel (brecht) removed a project: BF Blender: 2.8.
Brecht Van Lommel (brecht) added a subscriber: Brecht Van Lommel (brecht).

Even if it happens in all your files, we still need to have one simple example .blend to reproduce the problem.

Marin Myftiu (mm25) added a comment.Jan 13 2019, 8:12 PM

OK, after trying this and that I finally narrowed it down to a texture issue: For cycles I added a texture limit in the simplify panel and now it runs fine. In eevee, in scene/shadows panel I dialed the cascade size down to 512 px from 1024.
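
The two settings mentioned above can also be applied from a script. The following is a minimal sketch, assuming the Blender 2.8x Python property names `scene.cycles.texture_limit_render` and `scene.eevee.shadow_cascade_size` (both string enums) and assuming that the Cycles texture limit only applies while Simplify is enabled. The function name and the 2048 px limit are illustrative; the comment does not say which limit was used.

```python
def apply_texture_workarounds(scene, texture_limit="2048", cascade_size="512"):
    """Cap Cycles render textures and shrink the Eevee shadow cascade size.

    `scene` is expected to behave like bpy.context.scene in Blender 2.8x.
    The property names are assumptions based on the 2.8x Python API.
    """
    # The texture limit lives in the Simplify panel, so Simplify is switched on.
    scene.render.use_simplify = True
    scene.cycles.texture_limit_render = texture_limit
    # 1024 px -> 512 px, as described in the comment above.
    scene.eevee.shadow_cascade_size = cascade_size
    return scene


if __name__ == "__main__":
    try:
        import bpy  # only importable inside Blender
    except ImportError:
        print("Run this from Blender's Python console or Text Editor.")
    else:
        apply_texture_workarounds(bpy.context.scene)
```

As the later comments show, this only delayed the crashes for some users, so it is a mitigation to try rather than a fix.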

Marin Myftiu (mm25) added a comment.Jan 14 2019, 10:50 PM

...crashes keep happening, although I now get the recently famous white screen freeze instead of Blender just closing. Unfortunately all these files contain parts I cannot disclose for now. The only thing I can do is replace part of the file, hoping to still get the crashes...

Mark Gardner (ut_markle) added a subscriber: Mark Gardner (ut_markle).Feb 18 2019, 10:37 PM

Brand new Blender user here looking to dive in and found this thread for my crashes for animation.

I tried applying a texture limit for the render as suggested and it helped, but only a bit. Blender would usually lock up after 7 frames. With the texture limit it locked up at 36 frames.
Being new to this world, I apologize for not knowing exactly what you would need to keep looking into this. I'd be happy to send you whatever I can to help.

Mark Gardner (ut_markle) added a comment.EditedFeb 18 2019, 10:41 PM

I guess I can start with system specs...
Windows 10 Home 64-bit
Intel i9-9990k
Nvidia RTX 2070
32 GB memory
Blender version 2.80, 7d792976e100

Marin Myftiu (mm25) added a comment.Feb 19 2019, 12:13 AM
In T60462#623064, @Mark Gardner (ut_markle) wrote:

I guess I can start with system specs...
Windows 10 Home 64-bit
Intel i9-9990k
Nvidia RTX 2070
32 GB memory
Blender version 2.80, 7d792976e100

Hello and welcome!
Are you rendering in cycles or eevee?... not that there is any major difference in 2.8, it seems to crash with both, only eevee crashes seem totally random (sometimes it even finishes the video), while cycles seems more prone to crash within the first 50 frames.

Mark Gardner (ut_markle) added a comment.Feb 19 2019, 2:16 AM

I'm rendering in Cycles and it does regularly crash in the first 50 frames. Although I did start at frame 16 and made it all the way through 160 once. Once. *sigh*

Marcus Papathoma (machieb) added a subscriber: Marcus Papathoma (machieb).Feb 19 2019, 9:40 AM

I had the same problem last week. One of my files crashed at random while rendering in Cycles. I tried rendering on CPU only but it crashed too. It was impossible to reproduce. Then with an older beta version of Blender 2.8 it rendered.
I will have a look at it. I have to render a lot in the next weeks, so maybe I can reproduce the problem, or it is solved already.

Marin Myftiu (mm25) added a comment.EditedFeb 19 2019, 10:52 AM
In T60462#623268, @Marcus Papathoma (machieb) wrote:

I had the same problem last week. One of my files crashed at random while rendering in Cycles. I tried rendering on CPU only but it crashed too. It was impossible to reproduce. Then with an older beta version of Blender 2.8 it rendered.
I will have a look at it. I have to render a lot in the next weeks, so maybe I can reproduce the problem, or it is solved already.

I believe these crashes are quite frequent and have to do with the way Blender handles GPU memory (the fact that they appear on both AMD and Nvidia cards of various generations suggests that architecture or drivers have little to do with it). I just hope they solve this for the 2.8 release.
Another thing I am finding out recently is that 2.79 also crashes in dual-GPU renders before the memory limit is reached: I have 2 AMD cards of 8 GB each, so the usable total should be 8 GB. However, when memory use goes above 4 GB, crashes start to happen. It looks like Blender or the driver somehow allocates about 4 GB on each card and can't go past it. Render with only one card and the problem is gone. Can anyone try this on Nvidia cards?

Jacques Lucke (JacquesLucke) raised the priority of this task from 30 to 90.Mar 11 2019, 12:47 PM
Mark Gardner (ut_markle) added a comment.Mar 20 2019, 10:32 PM

Not sure if this is still a problem for some people but I did make one change. I had the default folder for renders changed to another drive. When I switched it back to the tmp folder I rendered fine for a full 250 frames. Can't say for sure whether it's related to the folder or because I was rendering less intensive projects.

Marcus Papathoma (machieb) added a comment.Mar 22 2019, 11:56 AM

For your information, I rendered lots of different sequences in cycles on many computers in our company during the last week. No crash during rendering.

Brecht Van Lommel (brecht) lowered the priority of this task from 90 to 30.Mar 22 2019, 12:04 PM

We still need to have one example .blend to reproduce the problem.

Mark Gardner (ut_markle) added a comment.Apr 3 2019, 9:32 PM


Here's the blend I was using. Again, it appears that the problem comes when I change the default folder for renders to another drive. I've changed it back to the system tmp folder and am rendering without problems.
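
If the output folder really is a factor, one way to keep rendering into the system tmp folder while still ending up with the frames on another drive is to move them after the render finishes. This is a minimal sketch using only the Python standard library; the function name, the folder names, and the `*.png` pattern are hypothetical, and nothing in this thread confirms that the output drive is the actual cause.

```python
import shutil
import tempfile
from pathlib import Path


def move_frames(local_dir, final_dir, pattern="*.png"):
    """Move rendered frames from a local scratch folder to their final folder.

    Returns the list of destination paths, sorted by file name.
    """
    final = Path(final_dir)
    final.mkdir(parents=True, exist_ok=True)
    moved = []
    for frame in sorted(Path(local_dir).glob(pattern)):
        target = final / frame.name
        shutil.move(str(frame), str(target))
        moved.append(target)
    return moved


if __name__ == "__main__":
    # Demo with throwaway folders; in practice `scratch` would be the folder
    # Blender renders into and `final` a folder on the other drive.
    scratch = Path(tempfile.mkdtemp())
    final = Path(tempfile.mkdtemp()) / "renders"
    (scratch / "0001.png").write_bytes(b"")
    print(move_frames(scratch, final))
```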

Aleksey (mastermind) added a subscriber: Aleksey (mastermind).EditedApr 5 2019, 12:07 AM
In T60462#645178, @Mark Gardner (ut_markle) wrote:

Not sure if this is still a problem for some people but I did make one change. I had the default folder for renders changed to another drive. When I switched it back to the tmp folder I rendered fine for a full 250 frames. Can't say for sure whether it's related to the folder or because I was rendering less intensive projects.

Thank you! It helped me! Before that I tried different paths and video formats, and none of it helped.
P.S. Using RTX 2060 with latest Creator Ready drivers, Windows 10 64-bit, Ryzen 7 1700, 16 GB RAM, latest Blender 2.8 64-bit, Eevee render

Aleksey (mastermind) removed a subscriber: Aleksey (mastermind).Apr 5 2019, 12:09 AM
Muhammad Radifar (radifar) added a subscriber: Muhammad Radifar (radifar).Apr 6 2019, 5:54 AM
Muhammad Radifar (radifar) added a comment.Apr 6 2019, 6:02 AM

Same issue here. Interestingly, when I change the thread count from 4 (max) to 3, the number of frames rendered before the crash increases significantly (around 2-4 times as many frames). I am using both CPU + GPU for rendering.

My system is:
Kubuntu 18.04
Intel Core i5 3570
DDR3 4 GB
NVIDIA GT 730 GDDR5 1 GB
Blender version: 2.80 Daily build (tried new build almost every week since february)

Marin Myftiu (mm25) added a comment.EditedApr 6 2019, 8:45 AM
In T60462#655682, @Muhammad Radifar (radifar) wrote:

Same issue here. Interestingly, when I change the thread count from 4 (max) to 3, the number of frames rendered before the crash increases significantly (around 2-4 times as many frames). I am using both CPU + GPU for rendering.

My system is:
Kubuntu 18.04
Intel Core i5 3570
DDR3 4 GB
NVIDIA GT 730 GDDR5 1 GB
Blender version: 2.80 Daily build (tried new build almost every week since february)

In your case 1 GB is a bit too low to render on the GPU, so crashes are to be expected more often. It is, however, most likely a memory management issue, since more memory seems to make things better.
I don't know much about Linux, but on Windows the "committed memory" increases substantially with each thread used.

Muhammad Radifar (radifar) added a comment.Apr 9 2019, 6:43 PM

After more testing with various settings I found another interesting thing. Somehow, when I set Render > Display Mode > Keep User Interface and use a minimal layout (3D viewport, shader editor, and Outliner), EEVEE renders perfectly fine (with ordinary settings it keeps crashing before 50 frames). I haven't tried this method with Cycles though.

PS. I'm using Animation Nodes in this test, and somehow when I change the shader editor to the Animation Nodes editor it keeps crashing when rendering.

Marcus Papathoma (machieb) removed a subscriber: Marcus Papathoma (machieb).Apr 9 2019, 6:45 PM
Philipp Oeser (lichtwerk) raised the priority of this task from 30 to 90.Apr 24 2019, 10:56 AM
Sebastian Parborg (zeddb) added a subscriber: Sebastian Parborg (zeddb).Apr 25 2019, 4:28 PM

I can't reproduce this on my end with (Linux, AMD).

Francesc Farfán (franfar) added a subscriber: Francesc Farfán (franfar).Jun 15 2019, 7:38 PM
Muhammad Radifar (radifar) added a comment.Jul 24 2019, 10:42 AM

I've been trying to render an animation using EEVEE with Blender 2.80 RC1 and RC2 and it keeps crashing. Cycles seems to be fine though. My system is still the same as above. There was one occasion when EEVEE worked, using the following trick:

In T60462#657414, @Muhammad Radifar (radifar) wrote:

After more testing with various settings I found another interesting thing. Somehow, when I set Render > Display Mode > Keep User Interface and use a minimal layout (3D viewport, shader editor, and Outliner), EEVEE renders perfectly fine (with ordinary settings it keeps crashing before 50 frames). I haven't tried this method with Cycles though.

PS. I'm using Animation Nodes in this test, and somehow when I change the shader editor to the Animation Nodes editor it keeps crashing when rendering.

That was when I was using blender-2.80-7ad21c3876c2-linux-glibc217-x86_64 (released on 11 or 12 July 2019).

Stefan Tapper (tappi) added a subscriber: Stefan Tapper (tappi).EditedJul 31 2019, 10:55 AM

I do not know if this is useful information.. having started Blender 2.8 RC3 on a command line with the -d option, this is the result on the random rendering animation crash for me:
Error : EXCEPTION_ACCESS_VIOLATION
Address : 0x00007FF7D8A937E4
Module : E:\blender-2.80rc3-windows64\blender.exe

This happened with either GPU or GPU+CPU (there is not a single image texture in the scene). I am currently trying to reproduce it with CPU only, but that takes ages :)
(Quadro P5000 NV 431.02, 2x Xeon E5-2699v3)

Edit: rendering to an external hard drive.

Muhammad Radifar (radifar) added a comment.Jul 31 2019, 1:06 PM
In T60462#740489, @Stefan Tapper (tappi) wrote:

I do not know if this is useful information.. having started Blender 2.8 RC3 on a command line with the -d option, this is the result on the random rendering animation crash for me:
Error : EXCEPTION_ACCESS_VIOLATION
Address : 0x00007FF7D8A937E4
Module : E:\blender-2.80rc3-windows64\blender.exe

This happened with either GPU or GPU+CPU (there is not a single image texture in the scene). I am currently trying to reproduce it with CPU only, but that takes ages :)
(Quadro P5000 NV 431.02, 2x Xeon E5-2699v3)

Edit: rendering to an external hard drive.

I tried the same thing on Linux and it just shows something like this:

Switching to fully guarded memory allocator.

Blender 2.80 (sub 75)
Build: 2019-07-29 17:17:04 Linux Release
argv[0] = ./blender
argv[1] = -d
/run/user/1000/gvfs/ non-existent directory
Read prefs: /home/radifar/.config/blender/2.80/config/userpref.blend
read file /home/radifar/.config/blender/2.80/config/startup.blend
Version 280 sub 54 date 2019-04-07 15:02 hash 75f551facaf0
found bundled python: /home/radifar/blender/blender-2.80-linux-glibc217-x86_64/2.80/python
Registered Animation Nodes
Read blend: /home/radifar/blender/Practice/collapsing-cube-eevee-02.blend
read file /home/radifar/blender/Practice/collapsing-cube-eevee-02.blend
Version 280 sub 61 date 2019-05-11 18:20 hash ebc44aae9897
Evaluate all animation - 1.000000
No Actions, so no animation needs to be evaluated...
Skipping auto-save, modal operator running, retrying in ten seconds...
Saved: '/home/radifar/Pictures/AN Collapsing Cube - EEVEE 720p/0001.png'
Time: 00:02.63 (Saving: 00:00.16)

Evaluate all animation - 2.000000
No Actions, so no animation needs to be evaluated...
Saved: '/home/radifar/Pictures/AN Collapsing Cube - EEVEE 720p/0002.png'
Time: 00:01.58 (Saving: 00:00.15)

Evaluate all animation - 3.000000
No Actions, so no animation needs to be evaluated...
Saved: '/home/radifar/Pictures/AN Collapsing Cube - EEVEE 720p/0003.png'
Time: 00:01.46 (Saving: 00:00.16)

Evaluate all animation - 4.000000
No Actions, so no animation needs to be evaluated...
Saved: '/home/radifar/Pictures/AN Collapsing Cube - EEVEE 720p/0004.png'
Time: 00:01.46 (Saving: 00:00.16)

Evaluate all animation - 5.000000
No Actions, so no animation needs to be evaluated...
Saved: '/home/radifar/Pictures/AN Collapsing Cube - EEVEE 720p/0005.png'
Time: 00:01.47 (Saving: 00:00.16)

Writing: /tmp/collapsing-cube-eevee-02.crash.txt
Writing: /tmp/collapsing-cube-eevee-02.crash.txt
Segmentation fault (core dumped)

So I tried the --verbose option, and while looking for an explanation of verbosity I found this trick:

blender -b filename.blend -a > nul 2>&1

source: https://blenderartists.org/t/control-blender-verbosity-when-starting-from-command-line/607255/4

And it is working!
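
The command above renders the animation in background mode (`-b`, no user interface) and discards the console output. Note that `nul` is the Windows discard device; in a Linux shell `> nul` writes to a file named `nul`, and `/dev/null` is the usual equivalent. Below is a minimal sketch of a wrapper that restarts such a background render after a crash, assuming Blender's `-s`/`-e` frame-range options and the `Saved: '...'` log lines shown in the output above. The function names are hypothetical, and nobody in this thread reports having run it.

```python
import re
import subprocess

# Matches lines such as: Saved: '/path/to/0005.png'
SAVED_RE = re.compile(r"^Saved: '.*?(\d+)\.\w+'\s*$", re.MULTILINE)


def last_saved_frame(log_text):
    """Return the highest frame number found in "Saved:" lines, or None."""
    frames = [int(m.group(1)) for m in SAVED_RE.finditer(log_text)]
    return max(frames) if frames else None


def build_render_command(blend_file, start, end, blender="blender"):
    """Background animation render of frames start..end (options before -a)."""
    return [blender, "-b", blend_file, "-s", str(start), "-e", str(end), "-a"]


def render_with_restarts(blend_file, start, end, max_attempts=5):
    """Re-launch Blender after a crash, resuming after the last saved frame."""
    for _ in range(max_attempts):
        proc = subprocess.run(
            build_render_command(blend_file, start, end),
            capture_output=True,
            text=True,
        )
        if proc.returncode == 0:
            return True  # Blender exited normally
        done = last_saved_frame(proc.stdout)
        if done is not None:
            start = max(start, done + 1)
        if start > end:
            return True  # every frame was written before the crash
    return False
```

A call such as `render_with_restarts("filename.blend", 1, 250)` would then keep going past an individual crash, at the cost of reloading the file each time. It does not address whatever causes the crash.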

Adam Preisler (Alphisto) changed the task status from Unknown Status to Unknown Status.Nov 16 2019, 4:31 PM
Adam Preisler (Alphisto) claimed this task.
Adam Preisler (Alphisto) added a subscriber: Adam Preisler (Alphisto).

Archiving this due to long inactivity. More information is also needed on how this now behaves in 2.80.75 or 2.81.16.

André Ferreira (thebadking) added a subscriber: André Ferreira (thebadking).EditedJan 10 2020, 9:46 PM

Check my repo
GitHub.com/thebadking/animationrender
See if my add-on fixes your problem

Jörg Dittmer (NoHow) added a subscriber: Jörg Dittmer (NoHow).EditedFeb 10 2020, 2:42 PM

Wow, sounds again like the usual Windows problem with long running computation on the GPU.
Just posted it on another issue a few minutes ago.
If that happens after more than 2 seconds of the GUI freezing, I have a gut feeling that
the Windows Timeout Detection and Recovery (TDR) feature is causing this.

SubstancePainter had same issues and they wrote a nice tutorial on how to change the registry values.
https://docs.substance3d.com/spdoc/gpu-drivers-crash-with-long-computations-128745489.html

I've stumbled across this problem in various software projects and it often leads to months of bug hunting :(
I'm surprised that the TDR feature is obviously not well known to most users/developers...
PLEASE SPREAD THE WORD!
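
For reference, the registry values in question are `TdrDelay` and `TdrDdiDelay` under `HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers`; the default `TdrDelay` of 2 seconds matches the 2-second freeze mentioned above. The sketch below only generates the text of a `.reg` file that raises both values. The 60-second values and the function name are illustrative assumptions; importing such a file needs administrator rights and a reboot, and a later comment in this thread reports that the TDR change did not help in that case.

```python
def tdr_reg_file(tdr_delay=60, tdr_ddi_delay=60):
    """Return the text of a .reg file that sets the Windows TDR timeouts.

    Both values are in seconds and are written as 32-bit hex DWORDs.
    """
    key = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\GraphicsDrivers"
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{key}]",
        f'"TdrDelay"=dword:{tdr_delay:08x}',
        f'"TdrDdiDelay"=dword:{tdr_ddi_delay:08x}',
        "",
    ]
    return "\r\n".join(lines)


if __name__ == "__main__":
    # Print the file contents for review instead of touching the registry.
    print(tdr_reg_file())
```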

André Ferreira (thebadking) added a comment.Feb 10 2020, 3:44 PM
In T60462#868656, @Jörg Dittmer (NoHow) wrote:

Wow, sounds again like the usual Windows problem with long running computation on the GPU.
Just posted it on another issue a few minutes ago.
If that happens after more than 2 seconds of the GUI freezing, I have a gut feeling that
the Windows Timeout Detection and Recovery (TDR) feature is causing this.

SubstancePainter had same issues and they wrote a nice tutorial on how to change the registry values.
https://docs.substance3d.com/spdoc/gpu-drivers-crash-with-long-computations-128745489.html

I've stumbled across this problem in various software projects and it often leads to months of bug hunting :(
I'm surprised that the TDR feature is obviously not well known to most users/developers...
PLEASE SPREAD THE WORD!

could you test those projects rendering with my addon?

Clément Foucault (fclem) edited projects, added EEVEE & Viewport; removed Eevee.Jun 19 2020, 11:08 PM
akku (akku) added a subscriber: akku (akku).Jul 4 2020, 12:41 PM

Hello. I too have the same problem. I worked using the GPU, and at the time of rendering it crashed. So I switched to Intel HD Graphics 5500 and then rendered, and it worked.
:D

brian (brizo) added a subscriber: brian (brizo).Aug 28 2020, 1:13 PM

Hi folks. Just wanted to put my tuppence in to keep this thread alive, and I am wondering if there's been any movement on this problem.
I have two Windows 10 machines with two GPUs in each. Both have a GTX 980 Ti, and then have a GTX 720 or GTX 650 Ti as secondary.
Up to 2.8 I could render with both cards with no problem, and if there was a memory problem I would get a warning and the render would fail gracefully. Since 2.8 I have been unable to render with both cards even when memory usage is less than 1 GB, and Blender just crashes out with no error log (since that's not working yet for 2.8 on PC). Some very small renders (200 MB or so) seem OK.
It sometimes renders for a while and then boom, it all dies. I haven't had time to investigate all the possible combinations that might have an effect, and the fixes in this thread, including the TDR fix, have not worked. It's all just a bit frustrating to waste those small but still useful GPUs when I have so much stuff to get done.

André Ferreira (thebadking) added a comment.Aug 28 2020, 3:18 PM

https://github.com/thebadking/animationrender/releases/tag/1.0

try this version 1.0 of my addon to see if the crash keeps happening, it will narrow the problem down if it is a UI issue

brian (brizo) added a comment.Aug 28 2020, 4:08 PM

Hi Folks.
Feeling a bit stupid because I decided to spend a little time experimenting, and my first test seems to have paid off.
I switched off adaptive sampling and can now render on both cards. Perhaps I have missed a memo regarding when not to use AS?

A little more digging is required to see if it's something to do with my settings, but it seems adaptive sampling on multiple, differing GPUs is unreliable.
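
For anyone who wants to apply the same change from a script, here is a minimal sketch, assuming the Cycles scene property `use_adaptive_sampling` (only present in Blender versions that ship adaptive sampling, so it is treated as optional here). The function name is hypothetical.

```python
def disable_adaptive_sampling(scene):
    """Turn off Cycles adaptive sampling if the build exposes the setting.

    Returns True if adaptive sampling was enabled before the call.
    """
    cycles = scene.cycles
    if not hasattr(cycles, "use_adaptive_sampling"):
        return False  # older build without the option: nothing to change
    was_enabled = bool(cycles.use_adaptive_sampling)
    cycles.use_adaptive_sampling = False
    return was_enabled
```

Inside Blender this would be called as `disable_adaptive_sampling(bpy.context.scene)`. The follow-up comments report that this delays the crash but does not remove it.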

brian (brizo) added a comment.Aug 28 2020, 6:26 PM

OK. After my fix of not using Adaptive Sampling, I have been rendering large (15000x15000) images on two machines, both using 2 GPUs. One has been at it for over 3 hours (lots of transparency) and is still going strong; the other was almost done with an image after about 2 hours when, poop!, it disappeared. So AS seems to have something to do with it, but it is not the complete answer. I should perhaps try André's manager add-on. I'll shut up now, unless/until I find something useful.

jake downs (jakedowns) added a subscriber: jake downs (jakedowns).EditedSep 5 2020, 8:49 AM

note i’ve been running into a similar issue all of a sudden using 2.82. tried upgrading to 2.9 but the issue still persists.

ryzen 3900x
64 ram
rtx 2080 super 8gb

i’ve tried combinations of

  • default output path, vs custom output path
  • cuda cpu, gpu, & hybrid
  • optix gpu rendering mode (rtx)
  • cycles & eevee

seems to happen more often in cycles for me.

i did not know about the ability to run via command line with -d and increased verbosity. i’ll try that next

i’ll also try this cli-render queue addon if all else fails

thanks for the tip on adaptive sampling. that seems to be a new feature? and i forgot i had enabled it. i’ll try disabling it next time it crashes.

same thing here though, sometimes it’ll make it through 500 frames, sometimes it’ll close without error after 5 or 50. even when i burn peak mem usage on the renders i don’t see it going above 2gb (although with Windows overhead it’s probably eating 6+ gb of my measly 8gb card)

making sure to disable my second monitor, and close other apps like chrome seems to have helped. i just wish blender was somehow smarter about gpu resources and could gracefully wait out or throttle during times when memory usage is going to be less than ideal. i’m sure there’d be other ramifications and drawbacks/impracticalities to such a check. but even being able to say like; ok blender don’t try and allocate more than X gpu mem. or at the very least, throw a human readable error like it previously did when it failed to allocate cuda resources.

i’m sure there’s windows/nvidia/amd api limitations at play here too.

just wanted to chime in that it’s suddenly affecting me as well

i tried upgrading to the nvidia studio driver instead of the game ready one but that didn’t seem to make a difference.

curious if anyone else has denoising enabled?
even when i’m rendering with cpu, to avoid memory gpu limit, i think i forgot to disable optix denoising, which seems to still eat up a good chunk of gpu resources and could cause the crash somehow?

oh i’ll also try the texture size limit too. thanks for that tip as well.

oh one other note. a few weeks ago i ran like a 3-day render in the background on my rig while carrying out three regular workdays of programming, browsing the web, watching youtube etc. without any issue. that’s why it’s kind of stunning that all of a sudden my last two projects have had issues. they’re really simple setups too. one only has a mid-sized jpg texture. anyway, that’s all. 😌🙏

update

  • disabling adaptive sampling definitely increased the time before crash, but it still crashed.