Maniphest T67534

Segmentation fault when opening .blend file in 2.80rc2 that works fine in 2.79b
Closed, Archived

Assigned To
Brecht Van Lommel (brecht)
Authored By
Hubert Xiwuzehugituroxa (hubxiwu90)
Jul 23 2019, 6:21 PM
Tags
  • BF Blender
Subscribers
Alessio Monti di Sopra (a.monti)
Brecht Van Lommel (brecht)
dark999 (dark999)
Hubert Xiwuzehugituroxa (hubxiwu90)
William Reynish (billreynish)

Description

System Information
Operating system: Ubuntu 18.04.2 LTS
Operating system kernel: Linux-4.15.0-55-generic-x86_64-with-debian-buster-sid 64 Bits
Graphics card: NVIDIA Corporation GF108 [GeForce GT 730] (rev a1)
Graphics card driver: NVC1 nouveau 4.3 (Core Profile) Mesa 19.0.2
Using GNOME under X

Blender Version
Broken: version: 2.80 (sub 75), commit date: 2019-07-23 10:51, hash 19aa873f7002
Broken: version: 2.80 (sub 74), branch: master, commit date: 2019-07-18 14:52, hash: 38d4483c6a51, type: Release
Worked: version: 2.79 (sub 0), branch: master, commit date: 2018-03-22 14:10, hash: f4dc9f9d68b, type: Release

Short description of error
Blender 2.80rc2 crashes (Segmentation fault) when opening the attached file.

Exact steps for others to reproduce the error
Open the attached .blend file in Blender 2.80rc2 and notice that Blender crashes with a segmentation fault.

Generated crash logs

Event Timeline

Hubert Xiwuzehugituroxa (hubxiwu90) created this task. Jul 23 2019, 6:21 PM
Hubert Xiwuzehugituroxa (hubxiwu90) updated the task description. Jul 23 2019, 6:29 PM
William Reynish (billreynish) added a subscriber: William Reynish (billreynish). Jul 23 2019, 6:49 PM

It opens correctly and normally here, although I'm using current master. There's a chance a bug was fixed in the last couple of days. Try a newer build on https://builder.blender.org/download/.

dark999 (dark999) added a subscriber: dark999 (dark999). Jul 23 2019, 7:27 PM

IMO this report is a duplicate of https://developer.blender.org/T67284

Nouveau drivers do not meet the requirements of Blender 2.80. Eevee needs GPU hardware acceleration (CUDA) that Nouveau drivers cannot provide.
IMO, to avoid any problems, install the Nvidia proprietary drivers.

Alessio Monti di Sopra (a.monti) added a subscriber: Alessio Monti di Sopra (a.monti). Jul 23 2019, 7:51 PM

I can confirm that with Nvidia drivers the file opens correctly in linux, on both rc2 and master.

Hubert Xiwuzehugituroxa (hubxiwu90) added a comment. Jul 24 2019, 7:37 AM

Thank you for the quick response!

I checked on a different computer that is using the Nvidia proprietary drivers and I can confirm it does not crash there. I couldn't get the transparent texture to look the same as in 2.79, but it's a separate thing.
If it was officially decided that nouveau is not supported, it would be nice to warn the user on startup that they are using unsupported drivers that are known to have problems. Otherwise it is hard to tell, because Blender starts up correctly and seems to work until something bad happens.
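A startup warning of the kind suggested above could be sketched roughly as follows. This is not Blender code: the function names are hypothetical, and it assumes the three strings come from glGetString(GL_RENDERER), glGetString(GL_VENDOR) and glGetString(GL_VERSION), with "nouveau" appearing in at least one of them as in the system information of this report.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (not part of Blender): case-insensitive search for "nouveau".
bool mentions_nouveau(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s.find("nouveau") != std::string::npos;
}

// Hypothetical helper: returns a warning text, or an empty string when
// none of the driver strings mentions Nouveau.
// Assumption: the caller obtained the arguments from glGetString().
std::string driver_warning(const std::string &renderer,
                           const std::string &vendor,
                           const std::string &version) {
    if (mentions_nouveau(renderer) || mentions_nouveau(vendor) || mentions_nouveau(version)) {
        return "Warning: the Nouveau driver is not supported and is known to have problems.";
    }
    return "";
}
```

With the strings from this report split as renderer "NVC1", vendor "nouveau" and version "4.3 (Core Profile) Mesa 19.0.2" (the split is a guess), the function returns the warning text. For a vendor string such as "NVIDIA Corporation" with no mention of Nouveau anywhere, it returns an empty string.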

Hubert Xiwuzehugituroxa (hubxiwu90) added a comment. Jul 24 2019, 7:43 PM

I decided to give my rusty C debugging skills a shot, since I want to use Blender 2.80 on that computer and cannot use Nvidia proprietary drivers there (for policy reasons).

(gdb) bt
#0  0x00007fffcd736680 in nv50_ir::NVC0LegalizeSSA::handleDIV(nv50_ir::Instruction*) (this=this@entry=0x7fffffffae40, i=i@entry=0x7fffbcc5ff90)
    at ../src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp:54
#1  0x00007fffcd73738b in nv50_ir::NVC0LegalizeSSA::visit(nv50_ir::BasicBlock*) (this=0x7fffffffae40, bb=<optimized out>) at ../src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp:334
#2  0x00007fffcd6900d8 in nv50_ir::Pass::doRun(nv50_ir::Function*, bool, bool) (this=this@entry=0x7fffffffae40, func=<optimized out>, ordered=ordered@entry=false, skipPhi=skipPhi@entry=true)
    at ../src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp:495
#3  0x00007fffcd6901b4 in nv50_ir::Pass::doRun(nv50_ir::Program*, bool, bool) (this=this@entry=0x7fffffffae40, prog=prog@entry=0x7fffeada3f80, ordered=ordered@entry=false, skipPhi=skipPhi@entry=true)
    at ../src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp:466
#4  0x00007fffcd690273 in nv50_ir::Pass::run(nv50_ir::Program*, bool, bool) (this=this@entry=0x7fffffffae40, prog=prog@entry=0x7fffeada3f80, ordered=ordered@entry=false, skipPhi=skipPhi@entry=true)
    at ../src/gallium/drivers/nouveau/codegen/nv50_ir_bb.cpp:457
#5  0x00007fffcd732dd4 in nv50_ir::TargetNVC0::runLegalizePass(nv50_ir::Program*, nv50_ir::CGStage) const (this=<optimized out>, prog=0x7fffeada3f80, stage=<optimized out>)
    at ../src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp:3145
#6  0x00007fffcd68d50f in nv50_ir_generate_code(nv50_ir_prog_info*) (info=info@entry=0x7fffb71bfe00) at ../src/gallium/drivers/nouveau/codegen/nv50_ir.cpp:1265
#7  0x00007fffcd6d588a in nvc0_program_translate (prog=prog@entry=0x7fffc0e6e200, chipset=<optimized out>, debug=debug@entry=0x7fffe9a0c3c8) at ../src/gallium/drivers/nouveau/nvc0/nvc0_program.c:624
#8  0x00007fffcd6dd21d in nvc0_sp_state_create (pipe=0x7fffe9a0c000, cso=0x7fffffffb880, type=1) at ../src/gallium/drivers/nouveau/nvc0/nvc0_state.c:605
#9  0x00007fffcd90e963 in st_create_fp_variant (st=<optimized out>, stfp=stfp@entry=0x7fffbb109c30, key=key@entry=0x7fffffffba20) at ../src/mesa/state_tracker/st_program.c:1231
#10 0x00007fffcd911253 in st_get_fp_variant (st=<optimized out>, stfp=0x7fffbb109c30, key=0x7fffffffba20) at ../src/mesa/state_tracker/st_program.c:1258
#11 0x00007fffcd911a7c in st_precompile_shader_variant (st=st@entry=0x7fffe994c800, prog=prog@entry=0x7fffbb109c30) at ../src/mesa/state_tracker/st_program.c:1965
#12 0x00007fffcd9b8e0b in st_program_string_notify (ctx=<optimized out>, target=<optimized out>, prog=0x7fffbb109c30) at ../src/mesa/state_tracker/st_cb_program.c:250
#13 0x00007fffcd9def85 in st_link_shader(gl_context*, gl_shader_program*) (ctx=0x7fffe9b77380, prog=0x7fffbb0e01b0) at ../src/mesa/state_tracker/st_glsl_to_tgsi.cpp:7461
#14 0x00007fffcd981729 in _mesa_glsl_link_shader(gl_context*, gl_shader_program*) (ctx=ctx@entry=0x7fffe9b77380, prog=prog@entry=0x7fffbb0e01b0) at ../src/mesa/program/ir_to_mesa.cpp:3174
#15 0x00007fffcd8ccf8d in link_program (no_error=<optimized out>, shProg=<optimized out>, ctx=<optimized out>) at ../src/mesa/main/shaderapi.c:1206
#16 0x00007fffcd8ccf8d in link_program_error (ctx=0x7fffe9b77380, shProg=0x7fffbb0e01b0) at ../src/mesa/main/shaderapi.c:1286
#17 0x00000000040fd833 in GPU_shader_create_ex ()
#18 0x00000000040fdfc2 in GPU_shader_create ()
#19 0x0000000002a80e98 in DRW_shader_create_with_lib ()
#20 0x0000000002a451f7 in EEVEE_volumes_cache_init ()
#21 0x0000000002a12d76 in  ()
#22 0x0000000002a153e7 in DRW_draw_render_loop_ex ()
#23 0x0000000002cdd3e7 in view3d_main_region_draw ()
#24 0x0000000002d4f151 in ED_region_do_draw ()
#25 0x00000000015178a3 in wm_draw_update ()
#26 0x0000000001514fc0 in WM_main ()
#27 0x00000000010c0abe in main ()

(gdb) p ((struct gl_shader_program*)0x7fffbb0e01b0).NumShaders
$4 = 3
(gdb) p ((struct gl_shader_program*)0x7fffbb0e01b0).Shaders[0].Source
$5 = (const GLchar *) 0x7fffbcc33580 "#version 330\n#define GPU_VERTEX_SHADER\n#extension GL_ARB_texture_gather: enable\n#define GPU_ARB_texture_gather\n#extension GL_ARB_texture_query_lod: enable\n#define GPU_NVIDIA\n#define OS_UNIX\n#define EE"...
(gdb) p ((struct gl_shader_program*)0x7fffbb0e01b0).Shaders[1].Source
$6 = (const GLchar *) 0x7fffbcd4df40 "#version 330\n#define GPU_FRAGMENT_SHADER\n#extension GL_ARB_texture_gather: enable\n#define GPU_ARB_texture_gather\n#extension GL_ARB_texture_query_lod: enable\n#define GPU_NVIDIA\n#define OS_UNIX\n#define "...
(gdb) p ((struct gl_shader_program*)0x7fffbb0e01b0).Shaders[2].Source
$9 = (const GLchar *) 0x7fffbcd5e7c0 "#version 330\n#define GPU_GEOMETRY_SHADER\n#extension GL_ARB_texture_gather: enable\n#define GPU_ARB_texture_gather\n#extension GL_ARB_texture_query_lod: enable\n#define GPU_NVIDIA\n#define OS_UNIX\n#define "...

I attach the dumped shader sources (in case they are dynamically generated):


The top of the stack seems to be pointing to an assert, according to the GitHub clone https://github.com/mesa3d/mesa/blob/19.0/src/gallium/drivers/nouveau/codegen/nv50_ir_lowering_nvc0.cpp#L54

void
NVC0LegalizeSSA::handleDIV(Instruction *i)
{
   FlowInstruction *call;
   int builtin;

   bld.setPosition(i, false);

   // Generate movs to the input regs for the call we want to generate
   for (int s = 0; i->srcExists(s); ++s) {
      Instruction *ld = i->getSrc(s)->getInsn();
>>>>> assert(ld->getSrc(0) != NULL);  <<<<<<<
      // check if we are moving an immediate, propagate it in that case
      if (!ld || ld->fixed || (ld->op != OP_LOAD && ld->op != OP_MOV) ||
            !(ld->src(0).getFile() == FILE_IMMEDIATE))
         bld.mkMovToReg(s, i->getSrc(s));
      else {
         bld.mkMovToReg(s, ld->getSrc(0));
         // Clear the src, to make code elimination possible here before we
         // delete the instruction i later
         i->setSrc(s, NULL);
         if (ld->isDead())
            delete_Instruction(prog, ld);
      }
   }

Poking a bit more in that function suggests that the code generation fails in the main function of the fragment shader, although I have no idea whether the uppercase MAIN is actually the same as the shader main or something completely different...

#0  0x00007fffcd736680 in nv50_ir::NVC0LegalizeSSA::handleDIV(nv50_ir::Instruction*) (this=this@entry=0x7fffffffae40, i=i@entry=0x7fffbcc5ff90)

(gdb) p ((NVC0LegalizeSSA*)0x7fffffffae40).bld.func.name
$3 = 0x7fffcdd52386 "MAIN"

(gdb) p ((NVC0LegalizeSSA*)0x7fffffffae40).bld.prog.progType
$4 = nv50_ir::Program::TYPE_FRAGMENT

Unfortunately, that is as far as my rusty gdb skills let me go. I was not able to track it back to any specific line number or expression in the shader, and I don't have the driver source set up for gdb.
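One detail in the handleDIV excerpt may be worth noting: the marked assert calls ld->getSrc(0) one line before the `!ld` test in the if condition, so a null ld would fault inside the assert expression itself instead of failing it cleanly. The backtrace alone does not establish that this is what happened here. The sketch below only illustrates testing the pointer before dereferencing it; MockValue, MockInsn and has_first_source are hypothetical stand-ins, not nv50_ir types.

```cpp
#include <cassert>

// Hypothetical stand-ins for the driver's IR types; only the shape matters.
struct MockValue {};

struct MockInsn {
    MockValue *src0 = nullptr;
    // Mirrors the shape of getSrc(0) in the excerpt; the index is ignored here.
    MockValue *getSrc(int) const { return src0; }
};

// Returns true only when ld exists and has a first source.
// The pointer test comes first, so a null ld is never dereferenced.
bool has_first_source(const MockInsn *ld) {
    return ld != nullptr && ld->getSrc(0) != nullptr;
}
```

Calling has_first_source(nullptr) returns false without touching memory, whereas evaluating ld->getSrc(0) on a null ld is undefined behaviour.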

Brecht Van Lommel (brecht) changed the task status from Unknown Status to Unknown Status. Jul 25 2019, 12:49 AM
Brecht Van Lommel (brecht) claimed this task.
Brecht Van Lommel (brecht) added a subscriber: Brecht Van Lommel (brecht).

We currently only support NVIDIA drivers; the Nouveau drivers have some bugs. You can report issues to that project, but this is nothing we can fix in Blender.

Hubert Xiwuzehugituroxa (hubxiwu90) added a comment. Jul 25 2019, 8:04 PM

I traced the Nouveau driver codegen crash to the following code used in the fragment shader:

IrradianceData load_irradiance_cell(int cell, vec3 N)
{
  /* Keep in sync with diffuse_filter_probe() */

#if defined(IRRADIANCE_CUBEMAP)
...
#elif defined(IRRADIANCE_SH_L2)
...
#else /* defined(IRRADIANCE_HL2) */

  ivec2 cell_co = ivec2(3, 2);
  int cell_per_row = textureSize(irradianceGrid, 0).x / cell_co.x;   // <<! textureSize as input
  cell_co.x *= cell % cell_per_row;                                  // <<! used with modulo
  cell_co.y *= cell / cell_per_row;                                  // <<! or with division

  ivec3 is_negative = ivec3(step(0.0, -N));

  IrradianceData ir;
  ir.cubesides[0] = irradiance_decode(
      texelFetch(irradianceGrid, ivec3(cell_co + ivec2(0, is_negative.x), 0), 0));
  ir.cubesides[1] = irradiance_decode(
      texelFetch(irradianceGrid, ivec3(cell_co + ivec2(1, is_negative.y), 0), 0));
  ir.cubesides[2] = irradiance_decode(
      texelFetch(irradianceGrid, ivec3(cell_co + ivec2(2, is_negative.z), 0), 0));

#endif

  return ir;
}

Replacing textureSize(irradianceGrid, 0).x with a constant, or removing the division and modulo operations below (and replacing them with just assigning the value of cell_per_row) removes the crash.
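For reference, the integer arithmetic in the HL2 branch can be mirrored on the CPU as below. This is only a sketch for reasoning about the index math, not driver or Blender code: irradiance_cell_coord is a hypothetical name, and grid_width is an assumed stand-in for textureSize(irradianceGrid, 0).x.

```cpp
#include <cassert>
#include <utility>

// CPU-side mirror of the cell coordinate math in load_irradiance_cell().
// grid_width is an assumption standing in for textureSize(irradianceGrid, 0).x.
std::pair<int, int> irradiance_cell_coord(int cell, int grid_width) {
    const int cell_w = 3;  // ivec2 cell_co = ivec2(3, 2)
    const int cell_h = 2;
    const int cell_per_row = grid_width / cell_w;
    assert(cell_per_row > 0);  // the modulo and division below need a non-zero divisor
    return {cell_w * (cell % cell_per_row),   // cell_co.x *= cell % cell_per_row
            cell_h * (cell / cell_per_row)};  // cell_co.y *= cell / cell_per_row
}
```

With an assumed grid width of 30 texels there are 10 cells per row, so cell 0 maps to (0, 0) and cell 11 maps to (3, 2). The division and modulo marked in the shader are the integer operations that the driver has to lower when it compiles this function.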

Hubert Xiwuzehugituroxa (hubxiwu90) added a comment. Jul 26 2019, 3:12 PM

I was wrong. The problem was actually that cell was 0 in the code above.
This bug in Nouveau was fixed recently in the following commit: https://cgit.freedesktop.org/mesa/mesa/commit/?id=7493fbf032f5bcbf4c48187bc089c9a34f04a1d5
Thank you for your help.

Brecht Van Lommel (brecht) added a comment. Jul 26 2019, 3:29 PM

Great to hear, thanks for tracking this down.

Brecht Van Lommel (brecht) mentioned this in T76583: Blender crashes on startup. May 9 2020, 12:59 PM